ABSTRACT

RDF data are used to model knowledge in various areas such as life
sciences, the Semantic Web, bioinformatics, and social graphs. Real
RDF datasets reach billions of triples, which calls for a framework
that processes RDF data efficiently. The core function of RDF
processing is subgraph pattern matching. Two completely different
directions have been pursued to support efficient subgraph pattern
matching. One direction, followed for the last decade, develops
specialized RDF query processing engines that exploit the properties
of RDF data. The other, followed for over 30 years, develops
efficient subgraph isomorphism algorithms for general labeled graphs.
Although both directions share a similar goal (i.e., finding
subgraphs in data graphs for a given query graph), they have been
researched independently without a clear reason. We argue that a
subgraph isomorphism algorithm can easily be modified to handle
graph homomorphism, which is the RDF pattern matching semantics,
simply by removing the injectivity constraint. In this paper, based
on the state-of-the-art subgraph isomorphism algorithm, we propose an
in-memory solution, TurboHOM++, which is tamed for RDF processing,
and we compare it with representative RDF processing engines on
several RDF benchmarks on a server machine where billions of triples
can be loaded in memory. To speed up TurboHOM++, we also provide a
simple yet effective transformation and a series of optimization
techniques. Extensive experiments using several RDF benchmarks show
that TurboHOM++ consistently and significantly outperforms the
representative RDF engines. Specifically, TurboHOM++ outperforms its
competitors by up to five orders of magnitude.

1. INTRODUCTION 

The Resource Description Framework (RDF) is a standard for
representing knowledge on the web. It is primarily designed for
building the Semantic Web and has been widely adopted in the database
and data mining communities. RDF models a fact as a triple which
consists of a subject (S), a predicate (P), and an object (O). Due
to its simple structure, many practitioners materialize their data in
an RDF format. For example, RDF datasets are now pervasive in
various areas including life sciences, bioinformatics, and social
networks. The size of real RDF data reaches billions of triples. Such
billion-scale RDF data can be fully loaded into the main memory of
today's server machines (a 1 TB machine costs less than $40,000).

The SPARQL query language is a standard language for querying
RDF data in a declarative fashion. Its core function is subgraph
pattern matching, which corresponds to finding all graph
homomorphisms in the data graph for a query graph.

In recent years, there have been significant efforts to speed up the
processing of SPARQL queries by developing novel RDF query
processing engines. Many engines model RDF data as tabular
structures and process SPARQL queries using specialized join
methods. For example, RDF-3X treats RDF data as an edge table,
Edge(S, P, O), and materializes six different orderings of this
table, so that it can support many SPARQL queries using only
merge-based joins. Note that this approach is efficient for both
disk-based and in-memory environments since merge join exploits only
sequential scans. Some engines treat RDF data as graphs (or
matrices) and develop specialized graph processing methods for
processing SPARQL queries. For example, gStore uses specialized
index structures to process SPARQL queries. Note that these index
structures are based on gCode, which was originally proposed for
graph indexing.

Subgraph isomorphism, on the other hand, has been studied since
the 1970s. The representative algorithms are VF2, QuickSI, GraphQL,
GADDI, SPATH, and TurboISO. In order to speed up performance, these
algorithms exploit good matching orders and effective pruning rules.
A recent study shows that good subgraph isomorphism algorithms
significantly outperform graph-indexing-based ones. However, all of
these algorithms use only small graphs in their experiments, and
thus, it still remains unclear whether these algorithms can show
good performance for billion-scale graphs such as RDF data.

Although subgraph isomorphism processing and RDF query processing
have similar goals (i.e., finding subgraphs in data graphs for a
given query graph), the two lines of work have inexplicably
proceeded in different directions. Yet a subgraph isomorphism
algorithm can easily be modified to handle graph homomorphism, which
is the RDF pattern matching semantics, just by removing the
injectivity constraint.

In this paper, based on the state-of-the-art subgraph isomorphism
algorithm, we propose an in-memory solution, TurboHOM++, which is
tamed for RDF processing, and we compare it with representative RDF
processing engines on several RDF benchmarks on a server machine
where billions of triples can be loaded in memory. We believe that
this approach opens a new direction for RDF processing so that both
traditional directions can merge or benefit from each other.



By transforming RDF graphs into labeled graphs, we can apply
subgraph homomorphism methods to RDF query processing. Extensive
experiments using several benchmarks show that a direct modification
of TurboISO outperforms the RDF processing engines for queries which
require a small amount of graph exploration. However, for some
queries which require a large amount of graph exploration, the
direct modification is slower than some of its competitors. This
poses an important research question: "Is this phenomenon due to
inherent limitations of the graph homomorphism (subgraph
isomorphism) algorithm?" Our profiling results show that two major
subtasks of TurboISO require performance improvement: 1) exploring
candidate subgraphs in ExploreCandidateRegion and 2) enumerating
solutions based on candidate regions in SubgraphSearch. TurboHOM++
resolves these performance hurdles by proposing the type-aware
transformation and tailored optimization techniques.

First, in order to speed up ExploreCandidateRegion, we propose a
novel transformation (Section 4.1), called type-aware
transformation, which is simple yet effective in processing SPARQL
queries. In type-aware transformation, by embedding the types of
an entity (i.e., a subject or object) into a vertex label set, we can
eliminate the corresponding query vertices/edges from a query graph.
With type-aware transformation, the query graph becomes smaller and
its topology becomes simpler than that of the original query. The
transformation therefore improves performance by reducing the
amount of graph exploration.

In order to optimize performance in depth, in both
ExploreCandidateRegion and SubgraphSearch, we propose a series
of optimization techniques (Section 4.3), each of which contributes
significantly to performance improvement for such slow queries.
In addition, we explain how TurboHOM++ is extended to support
1) general SPARQL features such as OPTIONAL and FILTER, and
2) parallel execution in a non-uniform memory access (NUMA)
architecture. These general features are necessary to execute
comprehensive benchmarks such as the Berlin SPARQL benchmark
(BSBM). Note also that, when the RDF data size grows large, we have
to rely on the NUMA architecture.

Extensive experiments using several representative benchmarks 
show that TurboHOM++ consistently and significantly outperforms
all its competitors for all queries tested. Specifically, our method 
outperforms the competitors by up to five orders of magnitude with 
only a single thread. This indicates that a subgraph isomorphism 
algorithm tamed for RDF processing can serve as an in-memory 
RDF accelerator on top of a commercial RDF engine for real-time 
RDF query processing. 

Our contributions are as follows. 1) We provide the first direct
comparison between RDF engines and the state-of-the-art subgraph
isomorphism method tamed for RDF processing, TurboHOM++, through
extensive experiments and analyze the experimental results in depth.
2) In order to simplify a query graph, we propose a novel
transformation method called type-aware transformation, which
contributes to boosting query performance. 3) In order to speed up
query performance further, we propose a series of performance
optimizations as well as NUMA-aware parallelism for fast RDF
query processing. 4) Extensive experiments using several benchmarks
show that the optimized subgraph isomorphism method consistently
and significantly outperforms representative RDF query processing
engines.

The rest of the paper is organized as follows. Section 2 describes
subgraph isomorphism, its state-of-the-art algorithm, TurboISO, and
the modification of such algorithms for graph homomorphism.
Section 3 presents how a direct modification of TurboISO, TurboHOM,
handles SPARQL pattern matching. Section 4 describes how we obtain
TurboHOM++ from TurboHOM using the type-aware transformation and
optimizations for efficient SPARQL pattern matching. Section 5
describes how TurboHOM++ can handle general SPARQL features and
discusses the parallelization of TurboHOM++. Section 6 reviews the
related work. Section 7 presents the experimental results. Finally,
Section 8 presents our conclusion.

2. PRELIMINARY 

2.1 Subgraph Isomorphism and RDF Pattern
Matching Semantics

Suppose that a labeled graph is defined as $g(V, E, L)$, where $V$
is a set of vertices, $E \subseteq V \times V$ is a set of edges,
and $L$ is a labeling function which maps a vertex or an edge to the
corresponding label set or label, respectively. Then, subgraph
isomorphism is defined as follows.

Definition 1. Given a query graph $q(V, E, L)$ and a data
graph $g(V', E', L')$, a subgraph isomorphism is an injective
function $M : V \rightarrow V'$ such that
1) $\forall v \in V$, $L(v) \subseteq L'(M(v))$ and
2) $\forall (u, v) \in E$, $(M(u), M(v)) \in E'$ and
$L(u, v) = L'(M(u), M(v))$.

If a query vertex $u$ has a blank label set (or, equivalently, does
not specify a vertex label), it can match any data vertex. Here,
$L(u) = \emptyset$, and thus the subset condition,
$L(u) \subseteq L'(M(u))$, is always satisfied. Similarly, if a
query edge $(u, v)$ has a blank label, it can match any data edge by
generalizing the equality condition $L(u, v) = L'(M(u), M(v))$ to
$L(u, v) \subseteq L'(M(u), M(v))$.

Graph homomorphism is easily obtained from subgraph isomorphism
by just removing the injectivity constraint on $M$ in Definition 1.
Even though the RDF pattern matching semantics is based on graph
homomorphism, to answer SPARQL queries which have variables on
predicates, a mapping from a query edge to an edge label is also
required. We call such a graph homomorphism the e(xtended)-graph
homomorphism and present a formal definition for it as follows.

Definition 2. Given a query graph $q(V, E, L)$ and a data graph
$g(V', E', L')$, an e(xtended)-graph homomorphism is a pair of two
mapping functions: a query-vertex-to-data-vertex function
$M_v : V \rightarrow V'$ such that
1) $\forall v \in V$, $L(v) \subseteq L'(M_v(v))$ and
2) $\forall (u, v) \in E$, $(M_v(u), M_v(v)) \in E'$ and
$L(u, v) = L'(M_v(u), M_v(v))$,
and a query-edge-to-edge-label function
$M_e : V \times V \rightarrow L$ such that
$\forall (u, v) \in E$, $M_e(u, v) = L'(M_v(u), M_v(v))$.

The subgraph isomorphism problem (resp. the e-graph homo¬ 
morphism problem) is to find all distinct subgraph isomorphisms 
(resp. e-graph homomorphisms) of a query graph in a data graph. 
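As a minimal sketch of Definitions 1 and 2, the following Python checks whether a given vertex mapping is a valid match. The dictionary-based graph encoding and the names `is_match` and `edge_label_mapping` are assumptions made for illustration and do not come from the paper; the sketch also assumes at most one edge per ordered vertex pair and uses `None` for a blank query edge label.

```python
def is_match(q_vlabels, q_edges, g_vlabels, g_edges, M, injective=True):
    """Check a vertex mapping M (query vertex -> data vertex).

    q_vlabels / g_vlabels: dict vertex -> set of labels (empty set = blank)
    q_edges / g_edges:     dict (src, dst) -> edge label (None = blank, query only)
    injective=True  tests Definition 1 (subgraph isomorphism);
    injective=False drops the injectivity constraint (graph homomorphism).
    """
    if injective and len(set(M.values())) != len(M):
        return False
    # Condition 1: every query label set is contained in the data label set.
    for u, labels in q_vlabels.items():
        if not labels <= g_vlabels[M[u]]:
            return False
    # Condition 2: every query edge maps to a data edge with a matching label.
    for (u, v), label in q_edges.items():
        data_edge = (M[u], M[v])
        if data_edge not in g_edges:
            return False
        if label is not None and label != g_edges[data_edge]:
            return False
    return True


def edge_label_mapping(q_edges, g_edges, M):
    """M_e of Definition 2: the data edge label bound to each query edge."""
    return {(u, v): g_edges[(M[u], M[v])] for (u, v) in q_edges}
```

In this encoding, a query whose two source vertices both map to the same data vertex is rejected with `injective=True` and accepted with `injective=False`, which is exactly the difference between the two semantics.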

Figure 1 shows a query graph $q_1$ and a data graph $g_1$. In $q_1$,
_ means a blank vertex label set or a blank edge label. Under
subgraph isomorphism, there is only one solution:
$M^1 = \{(u_0, v_0), (u_1, v_1), (u_2, v_2), (u_3, v_3), (u_4, v_4)\}$.
Under e-graph homomorphism, there are three solutions:
$M_v^1 = M^1$, $M_e^1 = \{((u_0, u_1), a), ((u_0, u_4), b),
((u_2, u_1), a), ((u_2, u_3), a), ((u_3, u_4), c)\}$;
$M_v^2 = \{(u_0, v_2), (u_1, v_3), (u_2, v_2), (u_3, v_3), (u_4, v_5)\}$,
$M_e^2 = M_e^1$; and
$M_v^3 = \{(u_0, v_2), (u_1, v_1), (u_2, v_2), (u_3, v_3), (u_4, v_5)\}$,
$M_e^3 = M_e^1$.

2.2 TurboISO

In this subsection, we introduce the state-of-the-art subgraph
isomorphism solution, TurboISO, and its modification for the
e-graph homomorphism. Although we only describe the modification
of TurboISO for the e-graph homomorphism, such a modification is
applicable to other subgraph isomorphism algorithms including VF2,
QuickSI, GraphQL, GADDI, and SPATH, since all of the subgraph
algorithms mentioned are instances of a generic subgraph
isomorphism framework.

Figure 1: Example of subgraph isomorphism and e-graph
homomorphism. (a) Query graph $q_1$. (b) Data graph $g_1$.

TurboISO presents an effective method for the notorious matching
order problem from which all the previous subgraph isomorphism
algorithms have suffered. Figure 2 illustrates an example of the
matching order problem, where $q_2$ is the query graph and $g_2$ is
the data graph.* Note that this example query results in no
answers. However, the time to finish this query can differ
drastically depending on how one chooses the matching order, as it
leads to a different number of comparisons. For instance, a
matching order $\langle u_0, u_2, u_1, u_3 \rangle$ requires
$1 + 10000 \cdot 10 \cdot 5$ comparisons, while a different
matching order $\langle u_0, u_3, u_1, u_2 \rangle$ requires only
$1 + 5 \cdot 10$ comparisons.
Figure 2: Example showing the matching order problem.
(a) Query graph $q_2$. (b) Data graph $g_2$.
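To make the gap between the two matching orders concrete, the comparison counts quoted in the text for Figure 2 can be evaluated directly. The block below only carries out that arithmetic; it does not derive the counts from the figure itself.

```latex
\begin{align*}
\langle u_0, u_2, u_1, u_3 \rangle &: \quad 1 + 10000 \cdot 10 \cdot 5 = 500001 \\
\langle u_0, u_3, u_1, u_2 \rangle &: \quad 1 + 5 \cdot 10 = 51 \\
\frac{500001}{51} &\approx 9804
\end{align*}
```

Under these counts, the poor order performs roughly four orders of magnitude more comparisons than the good one, even though both return the same (empty) result.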

TurboISO solves the matching order problem with candidate region
exploration, a technique that accurately estimates the number of
candidate vertices for a given query path. In particular, TurboISO
first identifies candidate data subgraphs (i.e., candidate regions)
from the starting vertices (e.g., the shaded area in Figure 2b),
then explores each region by performing a depth-first search, which
allows almost exact selectivity estimation for each query path.

Algorithm 1 outlines the overall procedure of TurboISO in detail.
First, if a query graph has only one vertex $u$ and no edge, it is
sufficient to retrieve all data vertices which have $u$'s labels
(i.e., $V(g)_{L(u)}$) and to report a subgraph isomorphism for each
of them (lines 2-4). Otherwise, it selects the starting query vertex
from the query graph (line 6). Then, it transforms the query graph
into its corresponding query tree (line 7). After getting the query
tree, for each data vertex that contains the vertex label of the
starting query vertex, the candidate region is obtained by exploring
the data graph (line 9). If the candidate region is not empty, its
matching order is determined (line 11). The data vertex $v_s$ is
mapped to the first query vertex $u_s$ by assigning $M(u_s) = v_s$
and $F(v_s) = true$, where $F : V \rightarrow boolean$ is a function
which checks whether a data vertex is mapped or not (line 12). Then,
the remaining subgraph matching is conducted (line 13). Lastly, the
mapping $(u_s, v_s)$ is restored by removing the mapping for $u_s$
and assigning $F(v_s) = false$ (line 14).

*For simplicity, we omit the edge labels and allow only one vertex
label in the data graph.


Algorithm 1 TurboISO(q(V, E, L), g(V', E', L'))

Require: q: query graph, g: data graph
Ensure: all subgraph isomorphisms from q to g.

 1: if V(q) = {u} and E = ∅ then
 2:   for each v ∈ V(g)_{L(u)} do
 3:     report M = {(u, v)}
 4:   end for
 5: else
 6:   u_s ← ChooseStartQueryVertex(q, g)
 7:   q' ← WriteQueryTree(q, u_s)
 8:   for each v_s ∈ {v | v ∈ V, L(u_s) ⊆ L(v)} do
 9:     CR ← ExploreCandidateRegion(u_s, v_s)
10:     if CR is not empty then
11:       order ← DetermineMatchingOrder(q', CR)
12:       UpdateState(M, F, u_s, v_s)
13:       SubgraphSearch(q, q', g, CR, order, 1)
14:       RestoreState(M, F, u_s, v_s)
15:     end if
16:   end for
17: end if


ChooseStartQueryVertex. ChooseStartQueryVertex tries
to pick the starting query vertex which has the least number of
candidate regions. First, as a rough estimation, the query
vertices are ranked by their scores. The score of a query vertex $u$
is $rank(u) = \frac{freq(g, L(u))}{deg(u)}$, where $freq(g, L(u))$
is the number of data vertices that have $u$'s vertex labels and
$deg(u)$ is the degree of $u$. The score function prefers
lower frequencies and higher degrees. After obtaining the top-k
least-scored query vertices, the number of candidate regions is
more accurately estimated for each of them by using the degree
filter and the neighborhood label frequency (NLF) filter. The degree
filter qualifies the data vertices which have a degree equal to or
higher than that of their corresponding query vertices. The NLF
filter qualifies the data vertices which have an equal or larger
number of neighbors for all distinct labels of the query vertex. In
Figure 2, for example, $u_0$ becomes the starting query vertex since
it has the least number of candidate regions (= 1).
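The selection step described above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: it assumes an undirected graph stored as adjacency lists, exactly one label per vertex, a connected query with at least one edge (so degrees are non-zero), and the score taken as label frequency divided by degree. The function names are hypothetical.

```python
from collections import Counter


def neighbor_label_counts(v, adj, label):
    """Count how many neighbors of v carry each label."""
    return Counter(label[n] for n in adj[v])


def candidates(u, q_adj, q_label, g_adj, g_label):
    """Data vertices with u's label that pass the degree and NLF filters."""
    need = neighbor_label_counts(u, q_adj, q_label)
    result = []
    for v in g_adj:
        if g_label[v] != q_label[u]:
            continue
        if len(g_adj[v]) < len(q_adj[u]):              # degree filter
            continue
        have = neighbor_label_counts(v, g_adj, g_label)
        if all(have[l] >= n for l, n in need.items()):  # NLF filter
            result.append(v)
    return result


def choose_start_query_vertex(q_adj, q_label, g_adj, g_label, k=3):
    """Rank query vertices by freq/degree, then refine the top-k by filtering."""
    freq = Counter(g_label.values())
    ranked = sorted(q_adj, key=lambda u: freq[q_label[u]] / len(q_adj[u]))
    top_k = ranked[:k]
    return min(
        top_k,
        key=lambda u: len(candidates(u, q_adj, q_label, g_adj, g_label)),
    )
```

The cheap score is used only to shortlist `k` vertices; the more expensive filter-based count is then computed for the shortlist alone, which mirrors the two-stage estimation in the text.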

WriteQueryTree. Next, WriteQueryTree transforms the query
graph into the query tree. From the starting query vertex obtained
by ChooseStartQueryVertex, a breadth-first tree traversal is
conducted. Every non-tree edge $(u, v)$ of the query graph is also
recorded in the corresponding query tree. For example, when $u_0$ is
the starting query vertex, the non-tree edges of $q_2$'s query tree
are $(u_1, u_2)$, $(u_1, u_3)$, and $(u_2, u_3)$.
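A breadth-first construction of the query tree that also records non-tree edges might look like the sketch below. The adjacency-list encoding and the name `write_query_tree` are assumptions for illustration. The usage note treats $q_2$ as a four-vertex clique, which is inferred from the non-tree edges listed above rather than read from the figure.

```python
from collections import deque


def write_query_tree(q_adj, start):
    """Build a BFS tree of an undirected query graph.

    Returns (parent, non_tree_edges): parent maps each vertex to its tree
    parent (None for the root); non_tree_edges lists every query edge that
    is not a tree edge, each reported once.
    """
    parent = {start: None}
    non_tree_edges = []
    seen_edges = set()
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in q_adj[u]:
            edge = frozenset((u, w))
            if edge in seen_edges:
                continue              # already handled from the other endpoint
            seen_edges.add(edge)
            if w not in parent:
                parent[w] = u         # first visit: tree edge
                queue.append(w)
            else:
                non_tree_edges.append((u, w))
    return parent, non_tree_edges
```

With a clique on `u0` to `u3` and `u0` as the start, every other vertex becomes a child of `u0`, and the three remaining edges are returned as non-tree edges.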

ExploreCandidateRegion. Using the query tree and the starting
query vertex, ExploreCandidateRegion collects the candidate
regions. A candidate region is obtained by exploring the data graph
from the starting query vertex in a depth-first manner following the
topology of the query tree. During the exploration, the injectivity
constraint should be enforced. The shaded area of Figure 2b is the
candidate region $CR(v_0)$ based on $q_2$'s query tree. Note that
the candidate region expansion is conducted only after the current
data vertex satisfies the constraints of the degree filter and the
NLF filter.















DetermineMatchingOrder. After obtaining the candidate regions
for a starting data vertex, the matching order is determined for
each candidate region. Using the candidate region,
DetermineMatchingOrder can accurately estimate the number of
candidate vertices for each query path. Then, it orders all query
paths in the query tree by the number of candidate vertices. For
example, from $CR(v_0)$, the ordered list of query paths is
$[u_0.u_3, u_0.u_1, u_0.u_2]$. Thus, we can easily see that
$\langle u_0, u_3, u_1, u_2 \rangle$ is the best matching order
based on this ordered list.

SubgraphSearch. Exploiting the data structures obtained from
the previous steps, SubgraphSearch (Algorithm 2) enumerates
all distinct subgraph isomorphisms. It first determines the current
query vertex $u$ from a given matching order $order$ (line 1). Then,
it obtains a set of data vertices, $C_R$, from a candidate region
$CR$ (line 2). $CR(u, v)$ represents the candidate vertices of a
query vertex $u$ which are the children of $v$ in $CR$, and
$P(q', u)$ is the parent of $u$ in a query tree $q'$. For each
candidate data vertex $v$, if $v$ has already been mapped, the
current solution is rejected since it violates the injectivity
constraint of the subgraph isomorphism (lines 4-6). Next, by calling
IsJoinable, if the query vertex $u$ of the current data vertex $v$
has non-tree edges, the existence of the corresponding edges is
checked in the data graph (line 7). For example, given $CR(v_0)$ and
the matching order $\langle u_0, u_3, u_1, u_2 \rangle$, when
making the embedding for $u_1$, we must check whether there is an
edge from $M(u_1)$ to $M(u_3)$. If the IsJoinable test is passed,
the mapping information is updated by assigning $M(u) = v$ and
$F(v) = true$ (line 8). After updating the mapping, if all query
vertices are mapped, a subgraph isomorphism $M$ is reported (lines
9-10). Otherwise, further subgraph search is conducted (line 12).
Finally, all changes done by UpdateState are restored (line 14).


Algorithm 2 SubgraphSearch(q, q', g, CR, order, dc)

 1: u ← order[dc]
 2: C_R ← CR(u, M(P(q', u)))
 3: for each v ∈ C_R such that v is not yet matched do
 4:   if F(v) = true then
 5:     continue
 6:   end if
 7:   if IsJoinable(q, g, M, u, v, ...) then
 8:     UpdateState(M, F, u, v)
 9:     if |M| = |V(q)| then
10:       report M
11:     else
12:       SubgraphSearch(q, q', g, CR, order, dc + 1)
13:     end if
14:     RestoreState(M, F, u, v)
15:   end if
16: end for


Modifying TurboISO for e-Graph Homomorphism. We first
explain how the generic subgraph isomorphism algorithm can
easily handle graph homomorphism. The generic subgraph isomorphism
algorithm is implemented as a backtracking algorithm, where
we find solutions by incrementally extending partial solutions or
abandoning them when it is determined that they cannot be completed.
Here, given a query graph $q$ and its matching order
$(u_{\sigma(1)}, u_{\sigma(2)}, \ldots, u_{\sigma(|V(q)|)})$, a
solution is modeled as a vector
$\vec{v} = (M(u_{\sigma(1)}), M(u_{\sigma(2)}), \ldots,
M(u_{\sigma(|V(q)|)}))$, where each element in $\vec{v}$ is a data
vertex for the corresponding query vertex in the matching order. At
each step in the backtracking algorithm, if a partial solution is
given, we extend it by adding every possible candidate data vertex
at the end. Here, any candidate data vertex that does not satisfy
the following three conditions must be pruned.


1) $\forall u_i \in V(q)$, $L(u_i) \subseteq L(M(u_i))$

2) $\forall (u_i, u_j) \in E(q)$, $(M(u_i), M(u_j)) \in E(g)$ and
$L(u_i, u_j) = L(M(u_i), M(u_j))$

3) $M(u_i) \neq M(u_j)$ if $u_i \neq u_j$

Note that the third condition ensures the injective condition, guar¬ 
anteeing that no duplicate data vertex exists in each solution vector. 
Thus, by just disabling the third condition, the generic subgraph 
isomorphism algorithm finds all possible homomorphisms. 
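The effect of disabling the third condition can be seen in a small generic backtracking sketch. This is not TurboISO itself: it has no candidate regions or filters, and it assumes a dictionary encoding in which every query edge carries a concrete label and each ordered vertex pair has at most one edge. The name `enumerate_matches` and its parameters are hypothetical.

```python
def enumerate_matches(q_vlabels, q_edges, g_vlabels, g_edges, order, injective=True):
    """Enumerate all mappings along the matching order `order`.

    injective=True  enforces conditions 1-3 (subgraph isomorphism);
    injective=False enforces only conditions 1-2 (graph homomorphism).
    """
    solutions = []

    def consistent(M, u, v):
        if not q_vlabels[u] <= g_vlabels[v]:          # condition 1
            return False
        if injective and v in M.values():             # condition 3
            return False
        trial = {**M, u: v}
        for (a, b), label in q_edges.items():         # condition 2
            if u in (a, b) and a in trial and b in trial:
                if g_edges.get((trial[a], trial[b])) != label:
                    return False
        return True

    def extend(M, depth):
        if depth == len(order):
            solutions.append(dict(M))
            return
        u = order[depth]
        for v in g_vlabels:
            if consistent(M, u, v):
                M[u] = v                # extend the partial solution
                extend(M, depth + 1)
                del M[u]                # backtrack

    extend({}, 0)
    return solutions
```

For a query in which two vertices point to a common target, a data graph with two such sources yields two isomorphisms but four homomorphisms, since both query sources may then share one data vertex.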

Now, we describe how to disable the third condition in TurboISO,
which is an instance of the generic subgraph isomorphism algorithm.
TurboISO uses pruning rules by applying filters in
ExploreCandidateRegion and SubgraphSearch. First, the degree filter
and the NLF filter should be modified since a data vertex can be
mapped to multiple query vertices. The degree filter qualifies data
vertices which have an equal number or more neighbors than distinct
labels of their corresponding query vertices. The NLF filter
qualifies data vertices which have at least one neighbor for all
distinct labels of their corresponding query vertices. Second, lines
4-6 of SubgraphSearch, which ensure the third condition, should be
removed in order to disable the injectivity test. As we see here,
with minimal modification, TurboISO can easily support graph
homomorphism.
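The relaxed filters can be sketched as two small predicates. This is one possible reading of the description above, in which "distinct labels" means the distinct labels found among the query vertex's neighbors; it assumes one non-blank label per neighbor, and the function names are hypothetical.

```python
def passes_degree_filter_hom(q_neighbor_labels, g_neighbors):
    """Homomorphism degree filter: the data vertex needs at least as many
    neighbors as there are distinct labels among the query vertex's neighbors."""
    return len(g_neighbors) >= len(set(q_neighbor_labels))


def passes_nlf_filter_hom(q_neighbor_labels, g_neighbor_labels):
    """Homomorphism NLF filter: for every distinct neighbor label of the query
    vertex, the data vertex needs at least one neighbor with that label."""
    return set(q_neighbor_labels) <= set(g_neighbor_labels)
```

A query vertex with neighbor labels B, B, C would need three data neighbors under the isomorphism filters, but here a data vertex with one B neighbor and one C neighbor qualifies, because both B query neighbors may map to the same data vertex.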

In order to make TurboISO handle the e-graph homomorphism,
the query-edge-to-edge-label mapping, $M_e$, should additionally be
maintained in SubgraphSearch. For this, UpdateState assigns
$M_e(P(q', u), u) = L(M_v(P(q', u)), M_v(u))$, and RestoreState
removes this mapping. From here on, let us denote TurboISO modified
for the e-graph homomorphism as TurboHOM.

3. RDF QUERY PROCESSING BY E-GRAPH 
HOMOMORPHISM 

In this section, we discuss how RDF datasets can be naturally
viewed as graphs (Section 3.1), and thus how an RDF dataset can
be directly transformed into a corresponding labeled graph
(Section 3.2). After such a transformation, the subgraph
isomorphism algorithms modified for the e-graph homomorphism,
such as TurboHOM, can be applied for processing SPARQL queries.

3.1 RDF as Graph 

An RDF dataset is a collection of triples, each of which consists
of a subject, a predicate, and an object. By considering triples as
directed edges, an RDF dataset naturally becomes a directed graph:
the subjects and the objects are vertices while the predicates are
edges. Figure 3 is a graph representation of triples that captures
type relationships between university organizations. Note that we
use rectangles to represent vertices in RDF graphs to distinguish
them from the labeled graphs.


Figure 3: RDF graph.






























3.2 Direct Transformation 

To apply subgraph isomorphism algorithms modified for e-graph
homomorphism (e.g., TurboHOM) for RDF query processing, RDF
graphs have to be transformed into labeled graphs first.

The most basic way to transform RDF graphs is (1) to map subjects
and objects to vertex IDs and (2) to map predicates to edge labels.
We call such a transformation the direct transformation because
the topology of the RDF graph is kept in the labeled graph after the
transformation. The vertex label function $L(v)$ ($v \in V(g)$) is
the identity function (i.e., $L(v) = \{v\}$).
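The direct transformation amounts to dictionary-encoding the triples. The sketch below assumes triples arrive as `(subject, predicate, object)` string tuples and assigns integer IDs in order of first appearance; the function name and the ID scheme are illustrative choices, so the IDs it produces need not coincide with those in the paper's figures.

```python
def direct_transform(triples):
    """Map subjects/objects to vertex IDs and predicates to edge labels.

    Returns (vertex_id, edge_label, edges), where edges is a list of
    (source vertex ID, target vertex ID, edge label ID). Under the direct
    transformation the label set of vertex v is simply {v}.
    """
    vertex_id = {}
    edge_label = {}
    edges = []

    def vid(term):
        # len() is evaluated before insertion, so IDs are 0, 1, 2, ...
        return vertex_id.setdefault(term, len(vertex_id))

    for s, p, o in triples:
        label = edge_label.setdefault(p, len(edge_label))
        edges.append((vid(s), vid(o), label))
    return vertex_id, edge_label, edges
```

Because every triple becomes exactly one edge, `rdf:type` and `rdf:subClassOf` triples stay in the graph as ordinary edges, and a query that constrains types has to traverse them.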

Figure 4 shows the result of the direct transformation of Figure 3:
Figures 4a, 4b, and 4c are the vertex mapping table, the edge
label mapping table, and the transformed graph, respectively.


Subject/Object              Vertex
GraduateStudent             v0
Student                     v1
University                  v2
Department                  v3
student1                    v4
univ1                       v5
dept1.univ1                 v6
'012-345-6789'              v7
'john@dept1.univ1.edu'      v8

(a) vertex mapping table.

Predicate                   Edge Label
rdf:type                    a
rdf:subClassOf              b
undergradDegreeFrom         c
memberOf                    d
subOrganizationOf           e
telephone                   f
emailAddress                g

(b) edge label mapping table.

(c) transformed graph.

Figure 4: Direct transformation of RDF graph (vertex label
function $L(v) = \{v\}$).

A query graph is obtained from a SPARQL query. A query vertex
may hold the vertex label which corresponds to the subject or object
specified in the SPARQL query. If the query vertex corresponds to
a variable, the vertex label is left blank. For example, the SPARQL
query of Figure 5a is transformed into the query graph of Figure 5b.
Here the query vertex $u_0$, which corresponds to Student, holds
the vertex label $\{v_1\}$. In contrast, the query vertex $u_3$,
which corresponds to the variable X, has blank (_) as the vertex
label. Similarly, a query edge may hold the edge label which
corresponds to the predicate. For example, the edge label of
$(u_3, u_4)$ is c, as the edge corresponds to the
undergradDegreeFrom predicate.


SELECT ?X, ?Y, ?Z WHERE
{ ?X rdf:type Student .
  ?Y rdf:type University .
  ?Z rdf:type Department .
  ?X undergradDegreeFrom ?Y .
  ?X memberOf ?Z .
  ?Z subOrganizationOf ?Y . }

(a) SPARQL query.

(b) query graph.

Figure 5: Direct transformation of SPARQL query.


Note that, when a variable is declared on a predicate in a SPARQL
query, a query edge has a blank edge label. An e-graph homomorphism
algorithm can answer such SPARQL queries since an e-graph
homomorphism has an edge label mapping from query edges to their
corresponding edge labels.

Consequently, the direct transformation makes it possible to apply
conventional e-graph homomorphism algorithms for processing
SPARQL queries. In order to evaluate the performance of such
an approach, we applied TurboHOM to LUBM8000, a billion-triple
RDF dataset of the Lehigh University Benchmark (LUBM), after
applying the direct transformation. We compared the performance of
TurboHOM against two existing RDF engines: RDF-3X and
System-X.^ Figure 6 depicts the measured execution time of these
three systems in log scale. (See Section 7 for the details of the
experimental setup.)



Figure 6: Comparison between the original TurboHOM with the direct
transformation graph and other RDF engines (queries Q1 to Q14).

Although there is no clear winner among them, the figure reveals
that TurboHOM performs as well as the existing RDF engines. For
short-running queries (i.e., Q1, Q3-Q5, Q7, Q8, Q10-Q13),
TurboHOM shows faster elapsed time. As those queries specify a data
vertex ID, TurboHOM only needs a small amount of graph exploration
from one candidate region with an optimal matching order, while
RDF-3X and System-X require expensive join operations.

For long-running queries (i.e., Q2, Q6, Q9, and Q14), TurboHOM is
slower than some of its competitors. The performance of TurboHOM
largely relies on 1) graph exploration by ExploreCandidateRegion
and 2) subgraph enumeration by SubgraphSearch. Moreover, when a
query graph has non-tree edges, IsJoinable constitutes a large
portion of SubgraphSearch. The profiling results of long-running
queries confirmed that 1) ExploreCandidateRegion and SubgraphSearch
are the dominating factors and 2) for queries which have non-tree
edges (Q2 and Q9), IsJoinable is the dominating factor of
SubgraphSearch. Specifically, TurboHOM spent most of its time on
ExploreCandidateRegion (e.g., 46% for Q2, 70% for Q6, 72% for Q9,
and 69% for Q14) and SubgraphSearch (e.g., 54% for Q2, 30% for Q6,
28% for Q9, and 31% for Q14). Moreover, for queries which have
non-tree edges, most of the SubgraphSearch time was spent on
IsJoinable (e.g., 81.4% for Q2 and 77.6% for Q9). In order to speed
up ExploreCandidateRegion, we propose a novel transformation
(Section 4.1). Tailored optimization techniques are proposed for
improving the performance of both functions (Section 4.3).

4. TURBOHOM++

In this section, we propose an improved e-graph homomorphism
algorithm, TurboHOM++. We first introduce the type-aware
transformation, which can result in faster pattern matching than the
direct transformation (Section 4.1). TurboHOM++ processes the
labeled graph transformed by the type-aware transformation
(Section 4.2). Furthermore, for efficient RDF query processing, four
optimizations are applied to TurboHOM++ (Section 4.3).

^We anonymize the product name to avoid any conflict of interest. 

































4.1 Type-aware Transformation 

To enable the type-aware transformation, we devise the
two-attribute vertex model which makes use of the type information
specified by the rdf:type predicate. Specifically, this model
assumes that each vertex is associated with a set of labels (the
label attribute) in addition to its ID (the ID attribute). The label
attribute is obtained by following the rdf:type predicate: if a
subject has one or more rdf:type predicates, its types can be
obtained by following the rdf:type predicates (as well as
rdf:subClassOf predicates, transitively). For example, student1 in
Figure 3 has the label attribute {GraduateStudent, Student}.
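The label attribute described above can be computed by a reachability pass over the type triples. The following is a minimal sketch under stated assumptions: triples are `(subject, predicate, object)` string tuples, the predicate names default to the spellings used in this paper, and `type_aware_labels` is a hypothetical helper, not the authors' code.

```python
def type_aware_labels(triples, type_pred="rdf:type", subclass_pred="rdf:subClassOf"):
    """Split triples into type information and ordinary edges.

    Returns (labels, plain): labels maps each entity to the set of types
    reachable via rdf:type / rdf:subClassOf triples; plain holds the
    remaining triples, which become the edges of the transformed graph.
    """
    direct_types = {}    # entity -> directly asserted types
    superclasses = {}    # class  -> direct superclasses
    plain = []
    for s, p, o in triples:
        if p == type_pred:
            direct_types.setdefault(s, set()).add(o)
        elif p == subclass_pred:
            superclasses.setdefault(s, set()).add(o)
        else:
            plain.append((s, p, o))

    def labels_of(entity):
        seen, stack = set(), [entity]
        while stack:
            x = stack.pop()
            for nxt in direct_types.get(x, set()) | superclasses.get(x, set()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    entities = set(direct_types)
    for s, _, o in plain:
        entities.update((s, o))
    return {e: labels_of(e) for e in entities}, plain
```

Note that the type and subclass triples no longer appear among the returned edges: their information lives in the label sets, which is what lets the corresponding query vertices and edges be dropped from a query graph.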

The above two-attribute vertex model naturally leads to our new
RDF graph transformation, the type-aware transformation. Here,
subjects and objects are transformed to two-attribute vertices by
utilizing rdf:type predicates as described above. Then, the ID
attribute corresponds to the vertex ID, and the label attribute
corresponds to the vertex label. Figure 7 shows an example of the
mapping tables and the data graph, which is the result of the
type-aware transformation applied to Figure 3. Now, we formally
define the type-aware transformation as follows.

Definition 3. The type-aware transformation $(F_V, F_{ID}, F_E, F_{VL}, F_{EL})$
converts a set of triples $T(S, P, O)$ to a type-aware transformed
graph $G(V, E, ID, L)$. Let us divide $T$ into three disjoint
subsets whose union is $T$: $T'(S', P', O')$,
$T'_t(S'_t, P'_t, O'_t) = \{(s, \text{rdf:type}, o) \in T\}$, and
$T'_{sc}(S'_{sc}, P'_{sc}, O'_{sc}) = \{(s, \text{rdf:subClassOf}, o) \in T\}$.

1. A vertex mapping $F_V : S' \cup O' \cup S'_t \rightarrow V$, which is bijective,
maps a subject in $S' \cup S'_t$ or an object in $O'$ to a vertex.

2. A vertex ID mapping $F_{ID} : S' \cup O' \cup S'_t \rightarrow \mathbb{N} \cup \{\_\}$, which is
bijective, maps a subject in $S' \cup S'_t$ or an object in $O'$ to a vertex
ID or blank. Here, $F_{ID}(x) = \_$ if $x$ is a variable.

3. An edge mapping $F_E : T' \rightarrow E$, which is bijective, maps a
triple of $T'$ into an edge, $F_E(s, p, o) = (F_V(s), F_V(o))$.

4. A vertex label mapping $F_{VL} : O'_t \cup O'_{sc} \rightarrow VL \cup \{\_\}$, which is
bijective, maps an object of $O'_t \cup O'_{sc}$ into a vertex label. Here,
$F_{VL}(x) = \_$ if $x$ is a variable.

5. An edge label mapping $F_{EL} : P' \rightarrow EL \cup \{\_\}$, which is
bijective, maps a predicate of $P'$ into an edge label. Here,
$F_{EL}(x) = \_$ if $x$ is a variable.

6. A vertex ID mapping function $ID : V \rightarrow \mathbb{N}$ maps a vertex to a
vertex ID where $ID(v) = F_{ID} \circ F_V^{-1}(v)$.

7. A labeling function $L$ 1) maps a vertex to a set of vertex labels
such that, for $v \in V$, $L(v) = \{F_{VL}(o) \mid$ there is a path from $F_V^{-1}(v)$
to $o$ using triples in $T'_t \cup T'_{sc}\}$ and 2) maps an edge $e$ to an edge
label such that, for $e \in E$, $L(e) = F_{EL}(Pred(F_E^{-1}(e)))$ where
$Pred(s, p, o) = p$.
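The definition above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function name `type_aware_transform`, the ID assignment by order of first appearance, and the example triples (chosen to be consistent with the mapping tables of Figure 7) are all hypothetical. As a simplification, labels are collected by following one rdf:type edge and then rdf:subClassOf edges transitively.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
RDF_SUBCLASS = "rdf:subClassOf"  # predicate name as written in the text


def type_aware_transform(triples):
    """Split triples into T', T'_t, T'_sc and build a two-attribute vertex graph."""
    direct_types = defaultdict(set)   # T'_t: subject -> classes given by rdf:type
    superclasses = defaultdict(set)   # T'_sc: class -> direct superclasses
    plain = []                        # T': every other triple
    for s, p, o in triples:
        if p == RDF_TYPE:
            direct_types[s].add(o)
        elif p == RDF_SUBCLASS:
            superclasses[s].add(o)
        else:
            plain.append((s, p, o))

    # Vertex IDs for S' U O' U S'_t, assigned in order of first appearance
    # (an assumption; the definition only requires a bijection).
    vertex_id = {}
    for s, _, o in plain:
        vertex_id.setdefault(s, len(vertex_id))
        vertex_id.setdefault(o, len(vertex_id))
    for s in direct_types:
        vertex_id.setdefault(s, len(vertex_id))

    def labels_of(entity):
        """Classes reachable via rdf:type, then rdf:subClassOf transitively."""
        seen, stack = set(), list(direct_types.get(entity, ()))
        while stack:
            cls = stack.pop()
            if cls not in seen:
                seen.add(cls)
                stack.extend(superclasses.get(cls, ()))
        return seen

    labels = {vid: labels_of(entity) for entity, vid in vertex_id.items()}
    edges = [(vertex_id[s], p, vertex_id[o]) for s, p, o in plain]
    return vertex_id, labels, edges


# Hypothetical example triples, consistent with the mapping tables of Figure 7.
EXAMPLE = [
    ("student1", "rdf:type", "GraduateStudent"),
    ("GraduateStudent", "rdf:subClassOf", "Student"),
    ("univ1", "rdf:type", "University"),
    ("dept1.univ1", "rdf:type", "Department"),
    ("student1", "undergradDegreeFrom", "univ1"),
    ("student1", "memberOf", "dept1.univ1"),
    ("dept1.univ1", "subOrganizationOf", "univ1"),
    ("student1", "telephone", "'012-345-678'"),
    ("student1", "emailAddress", "'John@dept1.univ1.edu'"),
]
```

In this sketch the class names never become vertices; they survive only as vertex labels, which is the source of the size reduction discussed in this section.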

After finding a type-aware transformation $(F'_V, F'_{ID}, F'_E, F'_{VL}, F'_{EL})$
for a data graph $g(V', E', L', ID')$, we can also convert a
SPARQL query into a type-aware transformed query graph $q(V, E, L, ID)$
by using another type-aware transformation $(F_V, F_{ID}, F_E, F_{VL}, F_{EL})$
such that $F_{ID} = F'_{ID}$, $F_{VL} = F'_{VL}$, and $F_{EL} = F'_{EL}$.
For example, Figure 8 is the query graph type-aware
transformed from the SPARQL query in Figure 5a. Note that a query
vertex may have multiple vertex labels like a data vertex.

Now, we explain how the generic e-graph homomorphism algo¬ 
rithm works for type-aware transformed query/data graphs. When 
appending a candidate data vertex to the current partial solution, 
we additionally check the following condition for the ID attribute 
of the two-attribute vertex model. 


Subject/Object            Vertex ID
student1                  0
univ1                     1
dept1.univ1               2
'012-345-678'             3
'John@dept1.univ1.edu'    4

(a) vertex ID mapping table.

Type               Vertex Label
GraduateStudent    A
Student            B
University         C
Department         D

(b) vertex label mapping table.

Predicate              Edge Label
undergradDegreeFrom    a
memberOf               b
subOrganizationOf      c
telephone              d
emailAddress           e

(c) edge label mapping table.

Figure 7: Type-aware transformation of an RDF graph.

Figure 8: Type-aware transformation of the SPARQL query of Figure 5a.


$\forall u \in \{u \mid ID(u) \neq \_ \text{ for } u \in V\},\ ID(u) = ID'(M_v(u))$.

The virtue of the type-aware transformation is that it can improve
the efficiency of RDF query processing. Since the type-aware
transformation eliminates certain vertices and edges by embedding type
information into the vertex label, the resulting data/query graphs
have smaller size and simpler topology than those transformed by
the direct transformation.

As an example, let us consider the SPARQL query in Figure 5a.
After direct transformation, it becomes the query graph in Figure 5b
that has a relatively complex topology consisting of six vertices
and six edges. On the other hand, the type-aware transformation
produces the query graph in Figure 8 that has a simple triangle
topology. This reduced number of vertices and edges has a positive
effect on efficiency because it results in less graph exploration.

In general, the effect of the type-aware transformation can be
described in terms of the number of data vertices in all candidate
regions. Consider a SPARQL query which consists of a set of triples
$T$, its direct transformed query graph $q(V, E, L)$, and its
type-aware transformed query graph $q'(V', E', ID', L')$. Let
$O_{type} = \{o \mid (s, \text{rdf:type}, o) \in T \text{ or } (s, \text{rdf:subClassOf}, o) \in T\}$. In the
direct transformation, $o \in O_{type}$ is transformed to a query vertex.
Let $V_{type}$ be the set of direct transformed query vertices from $O_{type}$.
However, in the type-aware transformation, $o \in O_{type}$ is not
transformed to a query vertex, which satisfies $|V'| = |V| - |V_{type}|$.
Therefore, the type-aware transformation leads to less graph
exploration in ExploreCandidateRegion and SubgraphSearch.
Formally, using the type-aware transformation, the number of data
vertices in all candidate regions is reduced by

$$\sum_{v_s} \sum_{u \in V_{type}} |CR_{v_s}(u)|$$

where $v_s$ represents the starting data vertex for each candidate
region, and $CR_{v_s}(u)$ represents the set of data vertices in a candidate
region $CR(v_s)$ that correspond to $u$.



















4.2 Implementation 

TurbOHOM++ maintains two in-memory data structures: the inverse
vertex label list and the adjacency list. Figure 9a shows the
inverse vertex label list of the type-aware transformed data graph.
The 'end offsets' array records the exclusive end offset of the
'vertex IDs' for each vertex label. Figure 9b shows the adjacency
list of Figure 7d for the outgoing edges. The adjacency list stores
the adjacent vertices for each data vertex in the same way as the
inverse vertex label list. One difference is that the adjacency list
has an additional array ('end offsets') to group the adjacent
vertices of a data vertex for each neighbor type. Here, the neighbor
type refers to the pair of the edge label and the vertex label. For
example, $v_0$ in this data graph has four different neighbor types:
$(a, C)$, $(b, D)$, $(d, \_)$, and $(e, \_)$. Those four neighbor types
are stored in 'end offsets', and each entry points to the exclusive
end offset of the 'adjacent vertex IDs'. TurbOHOM++ maintains
another adjacency list for the incoming edges.
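The two offset-based structures described above can be sketched in Python as follows. This is a simplified illustration, not the system's actual layout: the class names, the parallel `type_keys` array, the use of `"_"` for vertices without a label, and the example graph (derived from the neighbor types listed for $v_0$) are assumptions made for the sketch.

```python
class InverseLabelList:
    """Maps each vertex label to the sorted IDs of the vertices carrying it."""

    def __init__(self, labels_of, label_order):
        self.label_index = {label: i for i, label in enumerate(label_order)}
        self.vertex_ids = []
        self.end_offsets = []  # exclusive end offset into vertex_ids per label
        for label in label_order:
            self.vertex_ids.extend(
                sorted(v for v, ls in labels_of.items() if label in ls))
            self.end_offsets.append(len(self.vertex_ids))

    def vertices(self, label):
        i = self.label_index[label]
        start = self.end_offsets[i - 1] if i > 0 else 0
        return self.vertex_ids[start:self.end_offsets[i]]


class AdjacencyList:
    """Outgoing neighbors grouped by neighbor type (edge label, vertex label)."""

    def __init__(self, num_vertices, edges, labels_of):
        groups = [dict() for _ in range(num_vertices)]
        for src, edge_label, dst in edges:
            # A neighbor with several labels is listed once per label.
            for vertex_label in (labels_of.get(dst) or {"_"}):
                groups[src].setdefault((edge_label, vertex_label), []).append(dst)
        self.group_end = []     # per vertex: exclusive end into end_offsets
        self.type_keys = []     # neighbor type of each group (sketch-only helper)
        self.end_offsets = []   # per neighbor type: exclusive end into adjacent_ids
        self.adjacent_ids = []
        for v in range(num_vertices):
            for key in sorted(groups[v]):
                self.adjacent_ids.extend(sorted(groups[v][key]))
                self.type_keys.append(key)
                self.end_offsets.append(len(self.adjacent_ids))
            self.group_end.append(len(self.type_keys))

    def neighbors(self, v, edge_label, vertex_label):
        lo = self.group_end[v - 1] if v > 0 else 0
        for g in range(lo, self.group_end[v]):
            if self.type_keys[g] == (edge_label, vertex_label):
                start = self.end_offsets[g - 1] if g > 0 else 0
                return self.adjacent_ids[start:self.end_offsets[g]]
        return []


# Hypothetical example graph with five vertices and labels A-D.
LABELS = {0: {"A", "B"}, 1: {"C"}, 2: {"D"}, 3: set(), 4: set()}
EDGES = [(0, "a", 1), (0, "b", 2), (2, "c", 1), (0, "d", 3), (0, "e", 4)]
inverse = InverseLabelList(LABELS, ["A", "B", "C", "D"])
adjacency = AdjacencyList(5, EDGES, LABELS)
```

Because every array stores only exclusive end offsets, the start of a group is read from the previous entry, which is why both lookups special-case index 0.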

We assume that graphs in our system are periodically updated 
from an underlying RDF source. For efficient graph update, a trans¬ 
actional graph store is definitely required. We leave this exploration 
to future work since it is beyond the scope of the paper. 

Note that TurbOHOM++ can also handle SPARQL queries
under the simple entailment regime correctly. In order to deal with
the simple entailment regime in the type-aware transformed graph,
TurbOHOM++ distinguishes $L_{simple}(v) = \{F_{VL}(o) \mid$ there is an edge
from $F_V^{-1}(v)$ to $o$ using triples in $T'_t\}$ from $L(v)$. TurbOHOM++ can
process a SPARQL query under the simple entailment regime using
$L_{simple}(v)$ instead of $L(v)$.

[Figure 9, panel (a) inverse vertex label list: arrays 'end offsets' and 'vertex IDs', grouped by the vertex labels A, B, C, D. Panel (b) adjacency list: arrays 'end offsets of label groups', 'end offsets', and 'adjacent vertex IDs'.]

Figure 9: In-memory data structures for the type-aware transformed
data graph ($adj(v)$: adjacent vertices of $v$; $adj(v, (el, vl))$:
adjacent vertices of $v$ which have vertex label $vl$ and are
connected with edge label $el$).

As the overall behavior of TurbOHOM++ is similar to TurbOHOM,
here, we describe how TurbOHOM++ uses the data structures in
ChooseStartQueryVertex (line 6 of Algorithm 1),
ExploreCandidateRegion (line 9 of Algorithm 1), and IsJoinable
(line 7 of SubgraphSearch).

ChooseStartQueryVertex. When computing $rank(u)$ for a
query vertex $u$, the inverse vertex label list is used to get
$freq(g, L(u))$ $(= |\bigcap_{l \in L(u)} V(g)_l|)$, where $V(g)_l$ is the set of
vertices having vertex label $l$. When $|L(u)| = 1$, getting the start
and end offsets of a specific vertex label is enough. When
$|L(u)| > 1$, for each $l \in L(u)$, all data vertices having $l$,
$V(g)_l$, are retrieved from the inverse vertex label list, and
$freq(g, L(u))$ is obtained by intersecting all $V(g)_l$. Additionally,
when a data vertex ID $v$ is specified in $u$, $freq(g, L(u)) = 1$ if
$v \in V(g)_l$ for each $l \in L(u)$. Otherwise, $freq(g, L(u)) = 0$.

One last case is when a SPARQL query has a query vertex which
has no label or ID at all. In order to handle such queries, we
maintain an index called the predicate index, where a key is a
predicate, and a value is a pair of a list of subject IDs and a list
of object IDs. This index is used to compute $freq(g, L(u))$.

ExploreCandidateRegion. After a query tree is generated,
candidate regions are collected by exploring the data graph in an
inductive way. In the base case, all data vertices that correspond to
the start query vertex are gathered in the same way as when computing
$freq(g, L(u))$. In the inductive case, once the starting data
vertices are identified, the candidate region exploration continues by
exploiting the adjacency information stored in the adjacency list.
If one vertex label and one edge label are specified in the query
graph, we can get the adjacent data vertices directly from the
adjacency list. If multiple vertex labels and one edge label are
specified, we collect the adjacent data vertices for each vertex
label using the adjacency list, and intersect them. In a case where
the vertex label or edge label is blank, TurbOHOM++ finds the correct
adjacent data vertices by 1) collecting all adjacent vertices which
match the available information (either vertex label or edge label)
and 2) unioning them. Additionally, if the current query vertex has
the data vertex ID attribute, we check whether the specified data
vertex is included in the data vertices collected from the adjacency
list.

IsJoinable. The IsJoinable test is equivalent to the inductive
case of ExploreCandidateRegion when a data vertex ID (previously
matched data vertex) is specified.
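The frequency computation used by ChooseStartQueryVertex can be sketched as below. This is an illustrative sketch under assumptions: the per-label vertex lists are modeled as a plain dictionary of sorted lists (rather than the offset arrays of the inverse vertex label list), the names `freq` and `intersect_sorted` are hypothetical, and query vertices without any label, which the text handles through the predicate index, are not covered.

```python
def intersect_sorted(a, b):
    """Intersect two sorted lists of vertex IDs with a linear merge."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def freq(vertices_by_label, query_labels, vertex_id=None):
    """Number of data vertices that carry every label in query_labels.

    If vertex_id is given (the query vertex specifies a data vertex ID),
    the result is 1 when that vertex carries all labels and 0 otherwise.
    """
    if not query_labels:
        raise ValueError("unlabeled query vertices need the predicate index")
    lists = [vertices_by_label.get(label, []) for label in query_labels]
    if vertex_id is not None:
        return 1 if all(vertex_id in ids for ids in lists) else 0
    result = lists[0]
    for ids in lists[1:]:
        result = intersect_sorted(result, ids)
    return len(result)


# Hypothetical label lists for illustration.
VERTICES_BY_LABEL = {"A": [0, 5, 7], "B": [0, 3, 5, 9], "C": [1]}
```

With a single label the answer is just the length of one list, which corresponds to reading a start and end offset; only the multi-label case needs an intersection.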


4.3 Optimization 

In this subsection, we introduce optimizations that we apply to
improve the efficiency of TurbOHOM++. Even though these
optimizations do not change TurbOHOM++ substantially, they can improve
the query processing efficiency significantly.

Use intersection on IsJoinable test (+INT). We optimize the
IsJoinable test in SubgraphSearch. SubgraphSearch performs
the IsJoinable test through multiple membership operations. In
contrast, the optimization performs a bulk of IsJoinable tests with
one $k$-way intersection operation, where $k$ is the number of edges
between the current query vertex ($u$ in line 1 of SubgraphSearch) and
the previously matched query vertices connected by non-tree edges.

SubgraphSearch checks the existence of the edges between
the current candidate data vertex and the already bound data
vertices by calling IsJoinable (line 7 of SubgraphSearch) when the
corresponding query graph has non-tree edges. Let us consider the query
graph (Figure 8), the query tree (Figure 10), and a data graph
(Figure 11). Suppose that, for a given matching order
$u_1 \rightarrow u_2 \rightarrow u_0$, the vertex $v_1$ is bound to $u_1$, and the
vertex $v_2$ is bound to $u_2$. Then, the next step is to bind a data
vertex to $u_0$. Because there is a non-tree edge between $u_0$ and
$u_2$, to bind a data vertex $v_i$ $(i = 0, 3, 4, \ldots, 1001)$ to $u_0$,
we need to check whether there exists an edge $v_i \rightarrow v_2$.



Figure 10: A query tree of the query graph of Figure 8.

Figure 11: An example data graph for illustrating +INT.

The original IsJoinable is called repeatedly to check for the
existence of the edge between each candidate data vertex and the
already matched data vertices. Let us consider the above example. For
each $v_i$ $(i = 0, 3, 4, \ldots, 1001)$, IsJoinable tests whether the edge
$v_i \rightarrow v_2$ exists. If $v_2$ is a member of $v_i$'s outgoing adjacency
list, the test succeeds, and the graph matching continues.

Instead, our modified IsJoinable tests all the edge occurrences
between the current candidate vertices ($C_R$ in line 3 of
SubgraphSearch) and the adjacency lists of the already matched data
vertices by one $k$-way intersection operation. Let us consider the
above example again. The modified IsJoinable finds the edges between
$v_2$ and the candidate data vertices $v_0, v_3, \ldots, v_{1001}$ at once. For
this, it is enough to perform one intersection operation between
$v_2$'s incoming adjacent vertices and the candidate data vertices.
Since the modified IsJoinable takes $C_R$ as a parameter, lines 3 and
7 of SubgraphSearch are merged into one statement.

Note that this optimization can improve the performance
significantly. In the above example, since only $v_0$ and $v_{1001}$ pass
the test, we can avoid calling the original IsJoinable 998 times.
Formally speaking, let us denote 1) the candidate data vertex set
for the current query vertex $u$ as $C_R$, 2) the previously matched
query vertex set, which is connected to the current query vertex by
non-tree query edges, as $\{u'_i\}_{i=1}^{k}$, and 3) the adjacent vertex set
of $v'_i$ $(= M_v(u'_i))$, where $u'_i$ is connected to $u$ with the vertex
label $vl$ and the edge label $el$, as $adj(v'_i, vl, el)$. Suppose that
$C_R$ and $adj(v'_i, vl, el)$ are stored in ordered arrays. Then, the
complexity of the original IsJoinable test is

$$C_{original} = O\left(|C_R| \cdot \sum_{i=1}^{k} \log |adj(v'_i, vl, el)|\right),$$

since IsJoinable is called for each $v \in C_R$, and
$O(\log |adj(v'_i, vl, el)|)$ time is required to conduct a binary search
over $|adj(v'_i, vl, el)|$ elements. On the contrary, the complexity of
the modified IsJoinable test is

$$\min\left(O\left(|C_R| + \sum_{i=1}^{k} |adj(v'_i, vl, el)|\right),\ C_{original}\right),$$

since the modified IsJoinable can choose the $k$-way intersection
strategy between scanning $(k + 1)$ sorted lists and performing
binary searches.
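The contrast between the two IsJoinable variants can be sketched in Python. This is a minimal illustration under assumptions, not the paper's code: the function names are hypothetical, the adjacency lists are the sorted neighbor lists of the already matched data vertices, and the bulk variant intersects the sorted lists by successive pairwise merges rather than a single simultaneous $k$-way scan.

```python
from bisect import bisect_left


def contains(sorted_ids, v):
    """Binary-search membership test in a sorted list of vertex IDs."""
    i = bisect_left(sorted_ids, v)
    return i < len(sorted_ids) and sorted_ids[i] == v


def is_joinable_original(candidates, adj_lists):
    """One membership test per candidate and per non-tree edge."""
    return [v for v in candidates if all(contains(adj, v) for adj in adj_lists)]


def is_joinable_bulk(candidates, adj_lists):
    """Filter all candidates at once by merging sorted lists."""
    result = candidates
    for adj in adj_lists:
        merged, i, j = [], 0, 0
        while i < len(result) and j < len(adj):
            if result[i] == adj[j]:
                merged.append(result[i])
                i += 1
                j += 1
            elif result[i] < adj[j]:
                i += 1
            else:
                j += 1
        result = merged
    return result


# The running example: candidates v0, v3, ..., v1001 for u0, and the
# incoming neighbors of v2 (only v0 and v1001 have an edge to v2).
CANDIDATES = [0] + list(range(3, 1002))
INCOMING_OF_V2 = [0, 1001]
```

Both functions return the same candidates; they differ only in cost, which is why the modified test can pick whichever strategy is cheaper for the list sizes at hand.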

Disable NLF Filter (-NLF). The second optimization is to disable
the NLF filter in ExploreCandidateRegion. The NLF filter may be
effective when the neighbor types are very irregular. However, in
practice, most RDF datasets are structured. For example, in our
sample RDF dataset (Figure 1), in most cases, a vertex corresponding
to a graduate student has telephone, emailAddress, memberOf, and
undergraduateDegreeFrom predicates. Accordingly, the NLF filter is
not helpful for such structured RDF datasets.

Disable Degree Filter (-DEG). The third optimization is to disable
the degree filter in ExploreCandidateRegion. Similar to the NLF
filter, the degree filter is effective when the vertex degrees are
very irregular, whereas the degrees in RDF datasets typically are not.

Reuse Matching Order (+REUSE). The last optimization is
to reuse the matching order of the first candidate region for all the
other candidate regions. That is, DetermineMatchingOrder
(line 6 of Algorithm 1) is called only once, and the same matching
order is used throughout the query processing. Without this
optimization, a different matching order is used for
each candidate region, because each candidate region could have a
very different number of candidate vertices for a given query path
in the e-graph homomorphism problems. However, typical RDF
datasets are regular at the schema level, i.e., well structured in
practice, and generating the matching order for each candidate region
is ineffective, especially when the size of each candidate region is
small. We also performed experiments with more heterogeneous
datasets, including Yet Another Great Ontology (YAGO) and
Billion Triples Challenge 2012 (BTC2012). This optimization
technique still shows good matching performance, as we will see in
our extensive experiments in Section 7, since these heterogeneous
datasets do not show extreme irregularity at the schema level.

Let us take the example of the expanded RDF data from Figure 3
and the query graph of Figure 8. Suppose that the query tree is
Figure 10. In that case, the starting data vertices of the candidate
regions are the data vertices with the University vertex label.
To make a candidate region for a starting data vertex, univ1, we
must find (1) all departments in univ1 and (2) all the graduate
students who got their undergraduate degrees from univ1. Because
the selectivity of (2) is higher than that of (1), the matching order
$u_1 \rightarrow u_2 \rightarrow u_0$ is chosen. For the other universities (starting
vertices), it is rare that the selectivity of (2) is higher than that of (1).
For such a case, it is more efficient to reuse the first matching order.

5. FURTHER IMPROVEMENT OF
TURBOHOM++

In this section, we briefly describe how TurbOHOM++ can handle
the general SPARQL keywords (Section 5.1), and how it can be
parallelized under the NUMA architecture (Section 5.2).

5.1 Supporting General SPARQL Keywords

Along with the basic graph pattern matching, we briefly describe
how an e-graph homomorphism algorithm can handle the general
SPARQL keywords: OPTIONAL, FILTER, and UNION. Thus,
TurbOHOM++ can support the explore use case queries of the Berlin
SPARQL benchmark using OPTIONAL, FILTER, and UNION.

OPTIONAL. To support queries that ask for information which
is not necessarily required, the OPTIONAL keyword is used.
Figure 12 is an example of such a query, which finds the price
of <product1> and, if possible, its rating and its homepages.
To handle OPTIONAL in TurbOHOM++, we propose a simple yet
effective technique as follows. First, TurbOHOM++ selects a start
query vertex which is not specified in an OPTIONAL clause.
Then, TurbOHOM++ makes a candidate solution using the
nullify-and-keep-searching strategy. In ExploreCandidateRegion,
if the current query vertex is in an OPTIONAL clause, and no
data vertex is matched, TurbOHOM++ nullifies the current query
vertex mapping in a candidate region. In SubgraphSearch,
even though the mapped data vertex is nullified, if the
corresponding query vertex is in an OPTIONAL clause, it invokes a
recursive call. After a candidate solution is constructed using
nullify-and-keep-searching, TurbOHOM++ qualifies it using the
qualify-and-exclude-duplicate strategy. The OPTIONAL semantics
enforces that all vertices in an OPTIONAL clause must be
mapped to data vertices; otherwise, the mappings of all query
vertices in the OPTIONAL clause are nullified. Also, when all vertices
in an OPTIONAL clause are nullified, the nullified final mapping
should be generated only once. TurbOHOM++ excludes the
duplicates by comparing the current final mapping with the previous
valid mapping. For example, suppose two successive solutions of
the example query are {(price, $100), (rating, 5),
(homepage, null)} and {(price, $100), (rating, 1),
(homepage, null)}. The preceding final solution is qualified as
{(price, $100), (rating, null), (homepage, null)}, but the latter
solution is dropped because it is the same as the preceding final
solution. The qualify-and-exclude-duplicate strategy is recursively
applied to handle the nested OPTIONAL clauses.

SELECT ?price ?rating ?homepage WHERE
{ <product1> rdf:type <Product>. <product1> price ?price.
  OPTIONAL {<product1> rating ?rating.
            <product1> homepage ?homepage.} }

Figure 12: A SPARQL query which has an OPTIONAL keyword.
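The qualify-and-exclude-duplicate step for a single, non-nested OPTIONAL clause can be sketched as follows. This is an illustrative sketch under assumptions: solutions are modeled as dictionaries with `None` standing for a nullified mapping, the function names are hypothetical, and duplicate exclusion is applied only to fully nullified mappings, compared against the previously emitted mapping.

```python
def qualify(solution, optional_vars):
    """Nullify every OPTIONAL variable unless all of them are matched."""
    if any(solution[var] is None for var in optional_vars):
        return {key: (None if key in optional_vars else value)
                for key, value in solution.items()}
    return dict(solution)


def qualify_and_exclude_duplicates(candidate_solutions, optional_vars):
    """Qualify each candidate and emit a nullified mapping only once."""
    results, previous = [], None
    for solution in candidate_solutions:
        qualified = qualify(solution, optional_vars)
        nullified = all(qualified[var] is None for var in optional_vars)
        if nullified and qualified == previous:
            continue  # same nullified mapping as the previous valid one
        results.append(qualified)
        previous = qualified
    return results


# The two successive candidate solutions from the example in the text.
CANDIDATE_SOLUTIONS = [
    {"price": 100, "rating": 5, "homepage": None},
    {"price": 100, "rating": 1, "homepage": None},
]
OPTIONAL_VARS = {"rating", "homepage"}
```

For the example, the first candidate is qualified to a mapping with both optional variables nullified, and the second candidate collapses to the same mapping and is dropped.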

FILTER. To exclude solutions that do not satisfy given conditions,
the FILTER keyword is used. Figure 13 is an example of such
a query, which finds all products which have a higher rating than
<product1>. To handle FILTER expressions, inexpensive filters
such as selection conditions are applied whenever we access the
corresponding vertices, while expensive filters such as join
conditions and regular expressions are applied after we find a solution
without these expensive filters.

SELECT ?product WHERE
{ <product1> rdf:type <Product>. <product1> rating ?r1.
  ?product rdf:type <Product>. ?product rating ?r2.
  FILTER(?r2 > ?r1) }

Figure 13: A SPARQL query which has a FILTER keyword.
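The two-phase handling of FILTER expressions can be sketched in Python. This is a toy illustration under assumptions, not the system's evaluator: the ratings table, the function name `match_with_filters`, and the selection condition passed in the usage example are hypothetical; only the join condition `?r2 > ?r1` comes from the query of Figure 13.

```python
def match_with_filters(ratings, anchor, selection, join_condition):
    """Apply a cheap selection during matching and a join filter afterwards."""
    r1 = ratings[anchor]
    partial = []
    for product, r2 in ratings.items():
        # Inexpensive filter: checked as soon as the vertex is accessed.
        if selection(r2):
            partial.append({"product": product, "r1": r1, "r2": r2})
    # Expensive filter: checked only on solutions found without it.
    return [s["product"] for s in partial if join_condition(s)]


# Hypothetical data: product name -> rating.
RATINGS = {"product1": 3, "product2": 5, "product3": 2, "product4": 4}
BETTER = match_with_filters(
    RATINGS,
    "product1",
    selection=lambda r2: r2 >= 0,                 # assumed selection condition
    join_condition=lambda s: s["r2"] > s["r1"],   # FILTER(?r2 > ?r1)
)
```

Pushing the selection into the scan keeps the set of partial solutions small before the costlier join condition is evaluated.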

UNION. In SPARQL, to support alternative pattern matching,
the UNION keyword is used. Figure 14 is an example of
such a query, which finds products having either <feature1>
or <feature2>. To handle the UNION keyword, the SPARQL
query is split into sub-queries, and an e-graph homomorphism
algorithm solves each sub-query. Then, the final solutions are the
union of the sub-queries' solutions, as the semantics of the UNION
keyword does not remove duplicated items.

SELECT ?product WHERE
{ {?product rdf:type <Product>. ?P hasFeature <feature1>.}
  UNION
  {?product rdf:type <Product>. ?P hasFeature <feature2>.} }

Figure 14: A SPARQL query which has a UNION keyword.


5.2 Parallel Processing 

After generating the query tree in Algorithm 1, each starting
data vertex can be processed independently, including candidate
region exploration, matching order determination, and subgraph
search (lines 9 - 15). Therefore, distributing a subset of
the starting data vertices to each thread is enough to parallelize
TurbOHOM++.

Distributing Starting Data Vertices. However, distributing the
starting data vertices in a pre-determined way may lead to workload
imbalance among threads. Although RDF datasets are regular at
the schema level, the cardinalities of one (many)-to-many
relationships at the instance level can vary significantly across
candidate regions. This property even holds for joins in relational
databases. For example, in Figure 8, the query involves three types
at the schema level (University, Graduate students, and Departments),
while universities can have significantly different numbers of graduate
students and departments, which leads to a different workload for
each university vertex. To keep the workload of each thread as even
as possible, we assign small chunks of the starting data
vertices to threads dynamically.
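Dynamic assignment of small chunks can be sketched with a shared cursor that threads advance whenever they run out of work. This is a scheduling illustration only, under assumptions: the function name, the chunk size, and the `work` callback are hypothetical, and Python threads do not provide the CPU parallelism of a native implementation.

```python
import threading


def process_in_chunks(start_vertices, work, num_threads=4, chunk_size=2):
    """Let each thread repeatedly grab the next small chunk of start vertices."""
    next_index = 0
    lock = threading.Lock()
    results = []

    def worker():
        nonlocal next_index
        while True:
            with lock:  # claim the next chunk atomically
                begin = next_index
                next_index += chunk_size
            if begin >= len(start_vertices):
                return
            chunk = start_vertices[begin:begin + chunk_size]
            chunk_results = [work(v) for v in chunk]
            with lock:
                results.extend(chunk_results)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A thread that draws a heavy starting vertex simply claims fewer chunks overall, so the remaining chunks flow to the other threads; with a static split, that thread would instead hold up the whole query.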

NUMA-aware Parallelization. Modern high-end workstations
adopt the NUMA architecture to maximize parallelism by
using multi-socket systems. However, to fully utilize the
parallelism NUMA provides, a parallel method should avoid
remote memory access, which retrieves data stored in a remote socket.
To avoid that, first, each page of a data graph is allocated in the
sockets' local memory in a round-robin way. With this, each thread can
expect uniform access latency for a data graph. Second, a thread is
pinned to a specific socket, and thread-specific data structures
are allocated in the same socket where the thread runs. By
doing so, a thread accesses its own data structures without remote
memory access.

6. RELATED WORK 

With the increasing popularity of RDF, the demand for SPARQL 
support in relational databases is also growing. To meet such de¬ 
mand, most open-source and commercial relational databases sup¬ 
port the RDF store and the RDF query processing. RDF datasets 
are stored into relational tables with a set of indexes. After that, 
SPARQL queries are processed by translating them into the equiv¬ 
alent join queries or by using special APIs. 

To support RDF query processing, many specialized stores for
RDF data were proposed. Similar to an RDBMS,
RDF-3X treats RDF triples as a big three-attribute table,
but boosts the RDF query processing by building exhaustive
indexes and maintaining statistics. RDF-3X processes many
SPARQL queries by using merge-based joins, which are efficient for
disk-based and in-memory environments. Different from RDF-3X,
Jena exploits multiple-property tables, while BitMat [2]
exploits a 3-dimensional bit cube, so that it can also support 2D
matrices of SO, PO, and PS. H-RDF-3X is a distributed RDF
processing engine where RDF-3X is installed in each cluster node.

Several graph stores support RDF data in their native graph
storage. gStore performs graph pattern matching using
the filter-and-refinement strategy. It first finds promising
subgraphs using the VS*-tree index. After that, the exact subgraphs
are enumerated in the refinement step. Trinity.RDF is a subsystem
of a distributed graph processing engine, Trinity. The
RDF triples are stored in Trinity's key-value store. When
processing RDF queries, Trinity.RDF implements special query processing
methods for RDF data.

In 1976, Ullmann [23] published his seminal paper on the
subgraph isomorphism solution based on backtracking. After his work,
many subgraph isomorphism methods were proposed to improve
the efficiency by devising their own matching order selection
algorithms and filtering constraints. Among
those improved methods, TurbOISO solves the notorious matching
order problem by generating the matching order for each
candidate region and by grouping the query vertices which have the
same neighbor information. The method shows the most efficient
performance among all representative methods.

Along with the backtracking-based methods, index-based
subgraph isomorphism methods were also proposed.
All of those methods first prune out unpromising data graphs
using low-cost filters based on the graph indexes. After filtering,
any subgraph isomorphism method can be applied to the
unfiltered data graphs. This technique is only useful when there are
many small data graphs. Thus, these index-based subgraph
isomorphism methods do not enhance RDF graph processing, since there
is only one big graph in an RDF database.

7. EXPERIMENTS 

We perform extensive experiments on large-scale real and
synthetic datasets in order to show the superiority of a tamed subgraph
isomorphism algorithm for RDF query processing. In the
experiments, we use TurbOHOM++. We assume that TurbOHOM uses the direct
transformation, while TurbOHOM++ uses the type-aware
transformation along with all optimizations. The specific goals of the
experiments are as follows: 1) we show the superior performance of
TurbOHOM++ over the state-of-the-art RDF engines (Section 7.2),
2) we analyze the effect of the type-aware transformation and the
series of optimizations (Section 7.3), and 3) we show the linear
speed-up of the parallel TurbOHOM++ with an increasing number of
threads (Section 7.4).

7.1 Experiment Setup 

Competitors. We choose three representative RDF engines as
competitors of TurbOHOM++: RDF-3X, TripleBit, and System-X.
Note that these three systems are publicly available. RDF-3X
is a well-known RDF store, showing good performance for
various types of SPARQL queries. TripleBit is a very recent RDF
engine efficiently handling large-scale RDF data. System-X is a
popular RDF engine exploiting bitmap indexing. We exclude
BitMat [2] from the performance evaluation since it is clearly inferior to
TripleBit. gStore is excluded since it is not publicly available.

Datasets. We use four RDF datasets in the experiments: LUBM,
YAGO, BTC2012, and BSBM. LUBM is
a de facto standard RDF benchmark which provides a synthetic
data generator. Using the generator, we create three datasets
(LUBM80, LUBM800, and LUBM8000), where the number
represents the scaling factor. YAGO is a real dataset which consists of
facts from Wikipedia and WordNet. BTC2012 is a real dataset
crawled from multiple RDF web resources. Lastly, BSBM is an
RDF benchmark which provides a synthetic data generator and
benchmark queries. BSBM uses more general SPARQL query
features such as FILTER, OPTIONAL, and UNION.

In order to support the original benchmark queries in LUBM, we 
load the original triples as well as inferred triples into databases. In 
order to obtain inferred triples, we use the state-of-the-art RDF in¬ 
ference engine. For example, LUBM8000 contains 1068394687 
original triples and 869030729 inferred triples. Note that this is the 
standard way to perform the LUBM benchmark. However, regard¬ 
ing BTC2012, we use the original triples only for database loading. 
This is because the BTC2012 dataset contains many triples that vi¬ 
olate the RDF standard, and thus the RDF inference engine refuses 
to load and execute inference for the BTC2012 dataset. BSBM 
contains 986410726 original triples and 11412064 inferred triples. 

Table 1 shows the number of vertices and edges of the graphs
transformed by the direct transformation and the type-aware
transformation. The reduced number of edges in the type-aware
transformed graph directly affects the amount of graph exploration in
e-graph homomorphism matching.

Table 1: Graph size statistics (direct: direct transformation,
type-aware: type-aware transformation).

Dataset     |V| direct   |E| direct    |V| type-aware   |E| type-aware
LUBM80      2644579      19461754      2644573          12357312
LUBM800     26304872     193691328     26304863         122994224
LUBM8000    263133301    1937425416    263133295        1230263406
BTC2012     367728453    1436545556    367459811        1185887764
BSBM        223938701    997822791     1937425416       893575906

Queries. Regarding LUBM, we use the 14 original benchmark
queries provided in the website^1. Previous work
modified some of the original queries because executing those
original queries without the inferred triples returns an empty result
set. Regarding YAGO and BTC2012, we use the same query sets
proposed in previous work, because they do not have official
benchmark queries. Some queries in the YAGO query set contain
predicates which do not exist in the YAGO dataset. We replace such
predicates in queries with the predicates in the dataset that have the
closest meaning. For example, the predicate bornInLocation
in Q1, Q5, and Q6 is replaced with bornin. Regarding BSBM,
we used 12 queries in the explore use case^2, which contain
OPTIONAL, FILTER, and UNION keywords and thus test the capability
of more general SPARQL query support.

In order to measure the pure subgraph matching performance, (1) 
we omit modifiers which reorganize the subgraph pattern matching 
results (e.g. DISTINCT and ORDER BY) in all queries and (2) we 
measure the elapsed time excluding the dictionary look-up time. 

Running Environment. We conduct the experiments on a server
running Linux with four Intel Xeon E5-4640 CPUs and 1.5TB RAM.
The server has the NUMA architecture with 4 sockets in
which each socket has its own CPU and local memory.

We measure the elapsed times with a warm cache. To do that,
we set up the competitors' running environment as follows. For
RDF-3X and TripleBit, following previous work, we put the database
files in the tmpfs in-memory filesystem, which is a kind of RAM disk. For
System-X, we set the memory buffer size to 400GB, which is
sufficient for loading the entire database in memory. We execute every
query five times, exclude the best and worst times, and compute the
average of the remaining three.
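The measurement protocol (five runs, drop the best and the worst, average the remaining three) can be written down as a short Python sketch. The function names and the use of `time.perf_counter` are assumptions for illustration; the paper does not state how the timings were collected.

```python
import time


def trimmed_mean_of_five(times_ms):
    """Average of the three middle values out of exactly five measurements."""
    if len(times_ms) != 5:
        raise ValueError("expected exactly five measurements")
    ordered = sorted(times_ms)
    return sum(ordered[1:4]) / 3


def measure(query_fn):
    """Run query_fn five times and report the trimmed mean in milliseconds."""
    samples = []
    for _ in range(5):
        start = time.perf_counter()
        query_fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return trimmed_mean_of_five(samples)
```

Dropping the two extremes makes the reported time less sensitive to a single slow or unusually fast run.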

7.2 Comparison between TurbOHOM++ and
RDF engines

We report the elapsed times of the benchmark queries using a
single thread. Since the server has a NUMA architecture, memory
allocation is always done within one CPU's local memory.

LUBM. Table 2 shows the number of solutions for all
benchmark queries in all LUBM datasets. Table 3 shows
experimental results for LUBM80, LUBM800, and LUBM8000. Note that
TripleBit was not able to return correct answers for two queries over
LUBM80/LUBM800 and for ten queries over LUBM8000. In
Table 3, we use 'X' or the superscript '*' over the elapsed times when
TripleBit returns incorrect numbers of solutions.

In order to analyze results in depth, we classify the LUBM
queries into two types. The first type of queries has a constant
number of solutions regardless of the dataset size. Q1, Q3 ~ Q5, Q7,
Q8, and Q10 ~ Q12 belong to this type. These queries are called
constant solution queries. The other queries (Q2, Q6, Q9, Q13,
and Q14) have increasing numbers of solutions proportional to the
dataset size. These queries are called increasing solution queries.

Regarding the constant solution queries, only TurbOHOM-t-t- 
achieves the ideal performance in LUBM, which means constant 


^http://swat.cse.lehigh.edu/projects/lubm/ 


^http://wifo5-03.informatik.uni-mannheim.de/bizer/berli 










Table 2: Number of solutions in LUBM queries. 


Dataset 

Ql 

Q2 

Q3 

Q4 

Q5 

Q6 

Q7 

Q8 

Q9 

QIO 

Qll 

Q12 

Q13 

Q14 

LUBM80 

4 

212 

6 

34 

719 

838892 

67 

7790 

21872 

4 

224 

15 

380 

636529 

LUBM800 

4 

2003 

6 

34 

719 

8352839 

67 

7790 

218261 

4 

224 

15 

3800 

6336816 

LUBM8000 

4 

2528 

6 

34 

719 

83557706 

67 

7790 

2178420 

4 

224 

15 

37118 

63400587 


Table 3: Elapsed time in LUBM [unit: ms] (X: wrong number of solutions (# of solutions difference > 3) , wrong number of 
solutions (# of solutions difference < 3)). 



Ql 

Q2 

Q3 

Q4 

Q5 

Q 6 

Q7 

Q 8 

Q9 

QIO 

Qll 

Q12 

Q13 

Q14 

TurbOHOM++ 

0.09 

6.37 

0.09 

0.13 

0.13 

4.43 

0.05 

2.26 

101.42 

0.09 

0.10 

0.10 

0.06 

3.08 

RDF-3X 

3.09 

188.90 

4.09 

12.37 

14.74 

375.04 

91.06 

58.32 

770.32 

3.19 

2.35 

3.52 

15.08 

262.41 

TripleBit 

2.56 

86.09 

12.82 

5.26 

18.92 

165.93 

24.76 

48.22* 

X 

9.23 

0.44 

1.86 

19.31 

132.09 

System-X 

2.00 

426.00 

2.00 

4.67 

2.67 

64.33 

4.00 

19.33 

3512.00 

2.00 

2.33 

4.67 

5.67 

47.00 


(a) LUBM80. 



System      Q1         Q2         Q3         Q4         Q5         Q6         Q7         Q8         Q9         Q10        Q11        Q12        Q13        Q14
TurbOHOM++  0.09       124.13     0.09       0.13       0.13       25.70      0.05       2.32       1239.46    0.09       0.10       0.09       0.12       19.72
RDF-3X      4.15       2473.01    5.17       16.50      25.02      5103.35    840.16     461.80     10033.57   3.83       7.48       7.09       100.16     3607.13
TripleBit   23.32      3548.58*   142.29     15.76      183.46     2309.57    187.39     181.20*    X          109.47     2.84       3.51       161.65     1818.52
System-X    2.67       4394.00    2.00       4.67       3.00       239.33     4.33       21.00      175040.33  2.00       2.33       4.00       29.00      186.33

(b) LUBM800.



System      Q1          Q2          Q3          Q4          Q5          Q6          Q7          Q8          Q9          Q10         Q11         Q12         Q13         Q14
TurbOHOM++  0.10        309.74      0.09        0.12        0.13        191.52      0.05        1.61        5238.79     0.09        0.11        0.10        0.83        149.53
RDF-3X      4.31        30492.93    4.87        19.53       94.89       65453.67    8476.19     4201.81     131053.33   4.15        23.27       12.83       630.91      48285.17
TripleBit   X           X           X           X           2348.87     18974.80    X           X           X           1251.25     X           X           X           14197.47
System-X    2.67        41449.33    2.67        5.00        3.00        1519.67     4.33        42.67       3123629.67  2.67        2.33        5.00        88.00       1155.00

(c) LUBM8000.


performance regardless of dataset size. This phenomenon is
analyzed as follows. Each constant solution query contains a query
vertex whose ID attribute is set to an entity in the RDF graph. Thus,
TurbOHOM++ chooses that query vertex as the starting query
vertex and generates a candidate region. Furthermore, in the LUBM
datasets, although we increase the scaling factor in order to increase
the database size, the size of the candidate region explored by every
constant solution query remains almost the same.
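
Why a bound query vertex keeps the work constant can be illustrated with a small Python sketch. Everything here is a simplifying assumption rather than the actual implementation: the ranking (number of candidate data vertices divided by query vertex degree), the function names `choose_start_vertex` and `make_data`, and the toy `Course`/`Student` labels are all hypothetical.

```python
from collections import Counter


def choose_start_vertex(query_vertices, data_labels):
    """Pick the query vertex with the smallest (candidate count / degree).

    query_vertices: name -> {"label": str or None, "entity": str or None,
                             "degree": int}
    data_labels:    data vertex id -> label
    This ranking is an assumption made for illustration only.
    """
    label_freq = Counter(data_labels.values())

    def rank(name):
        u = query_vertices[name]
        if u["entity"] is not None:
            candidates = 1  # bound to exactly one data vertex
        elif u["label"] is not None:
            candidates = label_freq[u["label"]]
        else:
            candidates = len(data_labels)  # unconstrained variable
        return candidates / max(u["degree"], 1)

    return min(query_vertices, key=rank)


def make_data(num_students):
    """Toy data graph labels: one course and a growing set of students."""
    labels = {"Course0": "Course"}
    labels.update({f"Student{i}": "Student" for i in range(num_students)})
    return labels


query = {
    "x": {"label": "Student", "entity": None, "degree": 1},
    "c": {"label": "Course", "entity": "Course0", "degree": 1},
}

small = choose_start_vertex(query, make_data(10))
large = choose_start_vertex(query, make_data(10_000))
print(small, large)
```

In this toy setting the bound vertex is selected no matter how many students are added, so the search always starts from a single data vertex.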

In contrast, the elapsed times of RDF-3X increase as the dataset
size increases. This is because the data size to scan for merge join
increases as the dataset size increases. Thus, the performance gap
between TurbOHOM++ and RDF-3X increases as the dataset size
increases. In LUBM80, TurbOHOM++ is 23.50 (Q11) ~ 1821.20
(Q7) times faster than RDF-3X. In LUBM800, TurbOHOM++
outperforms RDF-3X by 42.56 (Q10) ~ 16803.20 (Q7) times. In
LUBM8000, TurbOHOM++ outperforms RDF-3X by 43.10 (Q1) ~
169523.80 (Q7) times. TripleBit shows a similar trend to RDF-3X.
Accordingly, TurbOHOM++ is 4.40 (Q11 in LUBM80) ~ 18068.23
(Q5 in LUBM8000) times faster than TripleBit. System-X shows
constant elapsed times for these queries, although it is consistently
slower than TurbOHOM++ by up to 86.60 times.
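
The reported ranges are ratios of the elapsed times in Table 3. As a worked example, the following Python sketch recomputes the LUBM80 range against RDF-3X for the constant solution queries. The helper name `speedups` is an assumption; the timings are copied from Table 3(a).

```python
# Elapsed times in ms from Table 3(a), constant solution queries only.
turbohom = {"Q1": 0.09, "Q3": 0.09, "Q4": 0.13, "Q5": 0.13, "Q7": 0.05,
            "Q8": 2.26, "Q10": 0.09, "Q11": 0.10, "Q12": 0.10}
rdf3x = {"Q1": 3.09, "Q3": 4.09, "Q4": 12.37, "Q5": 14.74, "Q7": 91.06,
         "Q8": 58.32, "Q10": 3.19, "Q11": 2.35, "Q12": 3.52}


def speedups(baseline, ours):
    """Return {query: baseline time / our time} for the shared queries."""
    return {q: baseline[q] / ours[q] for q in ours if q in baseline}


ratios = speedups(rdf3x, turbohom)
smallest = min(ratios, key=ratios.get)
largest = max(ratios, key=ratios.get)

print(smallest, round(ratios[smallest], 2))
print(largest, round(ratios[largest], 2))
```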

For the increasing solution queries (Q2, Q6, Q9, Q13, and
Q14), TurbOHOM++ also shows the best performance in all LUBM
datasets. Overall, the elapsed times of TurbOHOM++ are
proportional to the number of solutions for these queries. Specifically,
after type-aware transformation, Q13 has one query vertex whose
ID attribute is set to an entity in the data graph. Thus, the number
of candidate regions is one, which is similar to the constant solution
queries. However, as the dataset size increases, the candidate region
size also increases. The other queries (Q2, Q6, Q9, Q14) do not
have any query vertex whose ID attribute is set to an entity in the
data graph. As the dataset size increases, the number of candidate
regions for these queries increases, while each candidate region size
does not change. All systems show increasing elapsed times as
the dataset size increases. RDF-3X shows 7.60 (Q9 in LUBM80) ~
760.13 (Q13 in LUBM8000) times longer elapsed times than
TurbOHOM++. TripleBit shows 13.51 (Q2 in LUBM80) ~ 1347.08
(Q13 in LUBM800) times longer elapsed times than TurbOHOM++
when considering the queries which have the right number of
solutions. System-X shows 7.72 (Q14 in LUBM8000) ~ 596.25 (Q9 in
LUBM8000) times longer elapsed times than TurbOHOM++. For the
constant solution queries, System-X seems to be the best
competitor of TurbOHOM++. However, regarding the most time-consuming
queries (Q2, Q9), System-X shows poor performance.

YAGO. Since the YAGO dataset contains only about 50 million
triples, all engines process the queries very efficiently. Unlike the
LUBM queries, the YAGO queries have only a few variables which
are set to types. Nevertheless, TurbOHOM++ exhibits the best
performance for all YAGO queries. Table 4 shows the exact number
of solutions and elapsed times in YAGO.

Specifically, TurbOHOM++ outperforms RDF-3X and System-X
by up to 25.95 and 15.01 times, respectively. This performance
improvement is due to good matching order selection and the series of
optimizations in the optimized TurbOHOM++. Again, TripleBit returns
incorrect numbers of solutions for all queries except Q2.

BTC2012. Table 5 shows the exact number of solutions and
elapsed times in BTC2012. Even though BTC2012 contains over
1 billion triples, all the engines process all BTC2012 queries quite
efficiently. This is because the shapes of the query graphs are simple
(tree-shaped). Furthermore, as in LUBM, Q2, Q4, and Q5 in the
BTC2012 query set contain one query vertex whose ID attribute
is set to an entity in the RDF graph. Still, TurbOHOM++
outperforms RDF-3X, TripleBit, and System-X by up to 422.60, 28.57,
and 266.18 times, respectively.

Table 4: Number of solutions and elapsed time [unit: ms] in
YAGO.

Query       Q1       Q2       Q3       Q4       Q5       Q6       Q7       Q8
# of sol.   196      0        2129     3150     12611    2006     43238    91
TurbOHOM++  1.33     0.13     16.93    1.19     3.61     20.52    31.35    4.04
RDF-3X      18.91    32.85    66.74    52.15    45.17    24.64    595.02   16.78
TripleBit   X        1.03     X        X        X        X        X        X
System-X    11.33    19.00    39.33    13.33    11.33    95.33    780.67   79.00


Table 5: Number of solutions and elapsed time [unit: ms] in
BTC2012.

Query       Q1       Q2       Q3       Q4       Q5       Q6       Q7       Q8
# of sol.   4        4        1        4        13       1        664      5996
TurbOHOM++  0.12     0.16     0.96     0.89     0.18     2.49     36.81    1.99
RDF-3X      6.67     7.52     10.42    13.07    69.97    22.75    392.73   841.96
TripleBit   1.56     1.81*    0.98     6.94     5.20     3.52     133.64*  X
System-X    8.00     4.67     5.00     12.33    4.67     663.67   110.67   351.67


BSBM. Table 6 shows the exact number of solutions and elapsed
times in BSBM. The open source RDF engines, RDF-3X and
TripleBit, are excluded as they do not support OPTIONAL and
FILTER. Like BTC2012, even though BSBM contains about
1 billion triples, TurbOHOM++ processes most BSBM queries in less
than 5 ms, except Q5 and Q6. That is because they have a small
number of solutions and contain one query vertex whose ID
attribute is set to an entity in the RDF graph. For those ten queries,
TurbOHOM++ outperforms System-X by 2.37 ~ 7284.47 times. Q5
and Q6 take longer than the other queries because they use
expensive filters such as join conditions (Q5) and a regular expression
(Q6) and filter out a large number of solutions after basic graph
pattern matching is finished. Before evaluating FILTER, Q5 (Q6)
has 178030 (2848000) solutions from the query graph pattern, of
which only 6803 (43508) qualify as final solutions.
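
To make the filtering overhead concrete, the short Python sketch below computes the share of basic graph pattern solutions that survive FILTER for Q5 and Q6. The function name `retained_percent` is an assumption; the counts are the ones quoted in the paragraph above.

```python
# Solution counts before and after FILTER evaluation (from the text).
before_filter = {"Q5": 178030, "Q6": 2848000}
after_filter = {"Q5": 6803, "Q6": 43508}


def retained_percent(query):
    """Percentage of pattern-matching solutions that pass the FILTER."""
    return 100.0 * after_filter[query] / before_filter[query]


for q in ("Q5", "Q6"):
    print(q, f"{retained_percent(q):.2f}% of the matched solutions are kept")
```

Under these numbers, fewer than 4% of the solutions produced by pattern matching remain after filtering in either query, which is consistent with the filters dominating the elapsed time.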

Table 6: Number of solutions and elapsed time [unit: ms] in
BSBM.

Query       Q1       Q2       Q3       Q4       Q5       Q6
# of sol.   79       17       202      142      6803     43508
TurbOHOM++  0.58     0.15     8.15     1.27     344.66   3969.18
System-X    10       1092.67  19.33    21.67    589.67   9889.00

Query       Q7       Q8       Q9       Q10      Q11      Q12
# of sol.   2        1        21       3        10       1
TurbOHOM++  0.25     0.16     0.11     0.23     0.14     0.12
System-X    23.33    12.33    4.00     11.00    3.00     8.00


7.3 Effect of Improvement Techniques 

We measure the effect of the improvement techniques including
the type-aware transformation (Section 4.1) and the four
optimizations (Section 4.3). For this purpose, we use the largest LUBM
dataset, LUBM8000. We first show the effect of the type-aware
transformation because it is beneficial to all LUBM queries. We
next show the effect of the four optimizations (Section 7.3.2).

7.3.1 Effect of Type-aware Transformation 
Table 7 shows the elapsed times for the LUBM queries in
LUBM8000 using the direct transformation (TurbOHOM) and the
type-aware transformation (TurbOHOM++ without optimizations).
Compared with the direct transformation, the type-aware
transformation improves the query performance by 1.01 (Q1) to 27.22 (Q6)
times.

The obvious reason for the performance improvement is the smaller
query sizes after the type-aware transformation. The reduced
query graph leads to smaller candidate regions and shorter
elapsed times. First of all, Q6 and Q14 benefit the most from the
type-aware transformation. After the type-aware transformation,
these queries become point-shaped. That is, solutions of these two
queries are directly obtained by iterating over the data vertices which
have the vertex label of the query vertex, which corresponds to
lines 2-4 in Algorithm 1. Q13 also benefits much from the
type-aware transformation, since the type-aware transformation chooses
a better starting query vertex than the direct transformation, which
chooses a query vertex having type information. Q1, Q3, Q4, Q5,
Q7, Q8, Q10, Q11, and Q12 do not benefit much from the type-aware
transformation because they already have a small number of
candidate vertices under the direct transformation.
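
The difference between the two transformations, and why a type-only query becomes point-shaped, can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the implementation evaluated here: it ignores class hierarchies, and the names `direct_transform`, `type_aware_transform`, `match_point_query`, and the toy triples are assumptions.

```python
RDF_TYPE = "rdf:type"


def direct_transform(triples):
    """Every subject and object becomes a vertex; every triple an edge."""
    vertices = set()
    edges = []
    for s, p, o in triples:
        vertices.update((s, o))
        edges.append((s, p, o))
    return vertices, edges


def type_aware_transform(triples):
    """Fold rdf:type triples into vertex label sets instead of edges."""
    labels = {}
    edges = []
    for s, p, o in triples:
        labels.setdefault(s, set())
        if p == RDF_TYPE:
            labels[s].add(o)  # the type becomes a label of s
        else:
            labels.setdefault(o, set())
            edges.append((s, p, o))
    return labels, edges


def match_point_query(labels, wanted_label):
    """Point-shaped query: scan the data vertices carrying the label."""
    return sorted(v for v, ls in labels.items() if wanted_label in ls)


triples = [
    ("s1", RDF_TYPE, "Student"),
    ("s2", RDF_TYPE, "Student"),
    ("c1", RDF_TYPE, "Course"),
    ("s1", "takesCourse", "c1"),
]

direct_vertices, direct_edges = direct_transform(triples)
labels, typed_edges = type_aware_transform(triples)
students = match_point_query(labels, "Student")

print(len(direct_vertices), len(direct_edges))  # graph size, direct
print(len(labels), len(typed_edges))            # graph size, type-aware
print(students)
```

In the toy data, the type-aware graph has fewer vertices and edges than the direct one, and a query such as "all students" needs no edge traversal at all.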

Q2 benefits less than the other long-running queries from
the type-aware transformation. The following is the
profiling result of Q2 with the direct and type-aware transformations.
Q2 with direct transformation takes 26774.73 milliseconds
in ExploreCandidateRegion and 31191.29 milliseconds in
SubgraphSearch. Note that, with direct transformation, the
starting vertex is arbitrarily chosen from u0, u1, and u2 in the query
graph figure, since they all have the same vertex label frequency
(freq(g, L(u_i)) = 1 for i = 0, 1, 2) and the same degree of 1. In our
implementation, the first query vertex u0 is chosen and thus the label of
the non-tree edge is subOrganizationOf. However, with type-aware
transformation, the starting vertex is u1 in the query graph figure
and the label of the non-tree edge is memberOf. Although the number of
candidate regions with u1 is the minimum among u0, u1, and u2,
the cost of IsJoinable calls for memberOf increases 1.30 times.
Thus, Q2 with type-aware transformation takes 9523.60
milliseconds in ExploreCandidateRegion and 40469.47 milliseconds in
SubgraphSearch. We achieve only 1.16 times performance
improvement. However, the cost of the IsJoinable call is
significantly reduced by using +INT. Thus, after applying type-aware
transformation and the tailored optimizations, the final elapsed time
for Q2 becomes 309.74 ms, i.e., 187.15 times performance
improvement compared with direct transformation only.
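
The three ratios quoted for Q2 follow directly from the profiling numbers. The Python sketch below recomputes them; the dictionary layout and variable names are assumptions, and the totals are the Q2 elapsed times from Table 7.

```python
# Q2 profiling in LUBM8000 (ms); totals are the Q2 entries of Table 7.
direct = {"ExploreCandidateRegion": 26774.73,
          "SubgraphSearch": 31191.29,
          "total": 57966.93}
type_aware = {"ExploreCandidateRegion": 9523.60,
              "SubgraphSearch": 40469.47,
              "total": 50016.13}
optimized_total = 309.74  # type-aware transformation plus optimizations

# Growth of the SubgraphSearch cost under the type-aware transformation.
search_growth = type_aware["SubgraphSearch"] / direct["SubgraphSearch"]
# End-to-end gain from the transformation alone.
gain_transformation = direct["total"] / type_aware["total"]
# End-to-end gain once the optimizations are applied as well.
gain_optimized = direct["total"] / optimized_total

print(round(search_growth, 2))
print(round(gain_transformation, 2))
print(round(gain_optimized, 2))
```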


7.3.2 Effect of Four Optimizations

In this experiment, we measure the effect of the four optimizations
of TurbOHOM++. We use Q2 and Q9 in LUBM8000 since these two
queries are the most time-consuming and exploit all
optimizations. All the other queries are omitted since their elapsed
times are too short to show the effect of the
optimizations. Note that the elapsed times of Q1, Q3 ~ Q5, Q7, Q8, and
Q10 ~ Q13 are too short (< 2 ms), and Q6 and Q14 do not benefit
from these optimizations since they are point-shaped.

Figure 15 shows the reduced times of Q2 and Q9 in LUBM8000
after applying these optimizations separately. The optimization
techniques on the X-axis are ordered by reduced time in decreasing
order: +INT, -NLF, -DEG, and +REUSE. Interestingly, even
though Q2 and Q9 have the same shape (i.e., triangular), the most
effective optimizations were different. +INT was the most effective
in Q2. -NLF was the most effective in Q9 since the size of each
candidate region was very small. -DEG was more effective in Q9
than in Q2 since Q9 has more data vertices applied to the degree
filter. +REUSE was effective in Q9, which has a large number of
candidate regions, while Q2 did not benefit from +REUSE.

Table 7: Effect of type-aware transformation in LUBM8000 (Performance gain = Direct transformation / Type-aware transformation).

Query                           Q1         Q2         Q3         Q4         Q5         Q6         Q7         Q8         Q9         Q10        Q11        Q12        Q13        Q14
Direct transformation (ms)      0.101      57966.93   0.11       0.16       0.43       5218.47    0.15       5.63       114116.33  0.10       0.21       0.30       21.48      3886.43
Type-aware transformation (ms)  0.100      50016.13   0.09       0.14       0.13       191.69     0.05       1.73       17829.50   0.09       0.11       0.10       1.33       149.60
Performance gain                1.01       1.16       1.23       1.09       3.34       27.22      2.80       3.25       6.40       1.14       1.95       3.01       16.17      25.98



Figure 15: Reduced elapsed time of each optimization (Elapsed
time of no-optimization: 50016.13ms (Q2) and 17829.50ms
(Q9)). X-axis: +INT, -NLF, -DEG, +REUSE.


7.4 Effect of Parallelization 

In the last experiment, we report the parallelization effect of
TurbOHOM++. Among the parallelizable queries (Q2, Q6, Q9, and
Q14), which have multiple starting data vertices, we choose Q2 and
Q9. The reasons are that 1) these queries are the most time-consuming
queries, and 2) Q6 and Q14 are point-shaped queries which do not
involve graph exploration. In order to show the parallelism, we
allocate the data graph in an interleaved way, where memory
page allocations are assigned to sockets in a round-robin way.

We vary the number of threads over 1, 4, 8, 12, and 16. As shown
in Figure 16, TurbOHOM++ shows super-linear speed-up with respect
to the number of threads. In Q2, TurbOHOM++ achieves 5.37,
10.49, 15.06, and 19.63 speed-up using 4, 8, 12, and 16 threads,
respectively. In Q9, TurbOHOM++ achieves 4.87, 8.23, 14.37, and
16.54 speed-up. In the experiment, though the data graph is evenly
spread out over 4 sockets, the data vertices of a candidate region may
not be uniformly distributed. That means executing pattern matching
for the candidate region on the socket which has more of its data
vertices in local memory is beneficial, since less
remote memory access is required. In this sense, measuring speed-up
based on a multiple of 4 threads is more reasonable. When
computing the speed-up based on the 4-thread elapsed time, the speed-up
of Q2 becomes 1.95, 2.80, and 3.65 for 8, 12, and 16 threads,
respectively.
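
The re-based speed-ups can be approximated from the single-thread speed-ups quoted above. The Python sketch below does so for Q2 and, as an extrapolation not reported in the text, for Q9. The function name `rebase` is an assumption. Because the inputs are already rounded, the last digit can differ from the value computed on raw timings (for example 3.66 instead of 3.65 for Q2 with 16 threads).

```python
# Speed-ups over the single-thread run, as reported in the text (rounded).
speedup_vs_one_thread = {
    "Q2": {4: 5.37, 8: 10.49, 12: 15.06, 16: 19.63},
    "Q9": {4: 4.87, 8: 8.23, 12: 14.37, 16: 16.54},
}


def rebase(speedups, base_threads=4):
    """Express each speed-up relative to the run with `base_threads`."""
    base = speedups[base_threads]
    return {t: s / base for t, s in speedups.items() if t > base_threads}


for query, speedups in speedup_vs_one_thread.items():
    relative = rebase(speedups)
    print(query, {t: round(r, 2) for t, r in relative.items()})
```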

Figure 16: Speed-up of TurbOHOM++ in Q2 and Q9 in the
LUBM8000 dataset (plotted series: Q2, Q9).


8. CONCLUSION 

The core function of processing RDF data is subgraph pattern
matching. There have been two completely different directions for
supporting efficient subgraph pattern matching. One direction is to
develop specialized RDF query processing engines exploiting the
properties of RDF data, while the other direction is to develop
efficient subgraph isomorphism algorithms for general, labeled graphs.
In this paper, we posed an important research question, “Can
subgraph isomorphism be tamed for efficient RDF processing?” In
order to address this question, we provided the first direct and
comprehensive comparison of the state-of-the-art subgraph
isomorphism method with representative RDF processing engines.

We first showed that a subgraph isomorphism algorithm requires
minimal modification to handle a graph homomorphism with edge
label mapping, which is the RDF graph pattern matching
semantics. We then provided a novel transformation method, called
type-aware transformation, along with a series of optimization
techniques. We next performed extensive experiments using RDF
benchmarks in order to show the superiority of the optimized
subgraph isomorphism over representative RDF processing engines.
Experimental results showed that the optimized subgraph
isomorphism method achieved consistent and significant speedup over
those RDF processing engines.
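
The claim that only the injectivity constraint separates the two semantics can be illustrated with a naive backtracking matcher in Python. This is a minimal sketch under simplifying assumptions (constant edge labels, no vertex labels, no candidate regions or matching order optimization), and the function name `match` and the toy graph are hypothetical; it is not the algorithm evaluated in this paper.

```python
def match(query_edges, data_edges, injective):
    """Enumerate mappings of query vertices onto data vertices that
    preserve every labeled edge.

    injective=True  -> subgraph isomorphism semantics
    injective=False -> graph homomorphism semantics
    """
    data_edge_set = set(data_edges)
    data_vertices = sorted({v for s, _, t in data_edges for v in (s, t)})
    query_vertices = sorted({v for s, _, t in query_edges for v in (s, t)})
    results = []

    def consistent(mapping):
        # Check each query edge whose endpoints are both already mapped.
        for s, label, t in query_edges:
            if s in mapping and t in mapping:
                if (mapping[s], label, mapping[t]) not in data_edge_set:
                    return False
        return True

    def extend(index, mapping):
        if index == len(query_vertices):
            results.append(dict(mapping))
            return
        u = query_vertices[index]
        for v in data_vertices:
            if injective and v in mapping.values():
                continue  # the only check that differs between semantics
            mapping[u] = v
            if consistent(mapping):
                extend(index + 1, mapping)
            del mapping[u]

    extend(0, {})
    return results


query = [("?a", "knows", "?c"), ("?b", "knows", "?c")]
data = [("alice", "knows", "carol"), ("bob", "knows", "carol")]

homomorphisms = match(query, data, injective=False)
isomorphisms = match(query, data, injective=True)
print(len(homomorphisms), len(isomorphisms))
```

In the toy graph, the homomorphism semantics also admits the solutions where ?a and ?b are bound to the same person, which the isomorphism semantics excludes.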

This study drew a promising conclusion that a subgraph
isomorphism algorithm tamed for RDF processing can serve as an
in-memory accelerator on top of a commercial RDF engine for
real-time RDF query processing as well. We believe that this approach
opens a new direction for RDF processing, so that both traditional
directions can merge or benefit from each other.

Acknowledgment 

This work was supported in part by a gift from Oracle Labs’
External Research Office. This work was also supported by the National
Research Foundation of Korea (NRF) grant funded by the Korea
government (MSIP) (No. NRF-2014R1A2A2A01004454) and the
MSIP (Ministry of Science, ICT and Future Planning), Korea,
under the “ICT Consilience Creative Program”
(IITP-2015-R0346-15-1007) supervised by the IITP (Institute for
Information & communications Technology Promotion).

References 

[1] D. J. Abadi et al. Sw-store: A vertically partitioned dbms for
semantic web data management. The VLDB Journal, 385-406, 2009.

[2] M. Atre et al. Matrix "bit" loaded: A scalable lightweight join
query processor for rdf data. In WWW ’10, 41-50.

[3] C. Bizer and A. Schultz. The berlin sparql benchmark.
International Journal on Semantic Web and Information Systems
(IJSWIS), 1-24, 2009.

[4] J. Broekstra et al. Sesame: A generic architecture for storing
and querying rdf and rdf schema. In ISWC ’02, 54-68.

[5] J. Cheng et al. Fg-index: Towards verification-free query
processing on graph databases. In SIGMOD ’07, 857-872.

[6] W. Fan et al. Graph homomorphism revisited for graph
matching. VLDB ’10, 1161-1172.

[7] A. Gubichev and T. Neumann. Exploiting the query structure
for efficient join ordering in SPARQL queries. In EDBT ’14,
439-450.

[8] Y. Guo et al. Lubm: A benchmark for owl knowledge base
systems. Web Semant., 158-182, 2005.

[9] W.-S. Han et al. Turboiso: towards ultrafast and robust
subgraph isomorphism search in large graph databases. In
SIGMOD ’13, 337-348.

[10] A. Harth. Billion Triples Challenge data set. Downloaded
from http://km.aifb.kit.edu/projects/btc-2012/, 2012.

[11] H. He and A. K. Singh. Graphs-at-a-time: Query language
and access methods for graph databases. In SIGMOD ’08,
405-418.

[12] J. Huang, D. J. Abadi, and K. Ren. Scalable sparql querying
of large rdf graphs. VLDB ’11, 1123-1134.

[13] J. Lee et al. An in-depth comparison of subgraph
isomorphism algorithms in graph databases. VLDB ’12, 133-144.

[14] V. Leis et al. Morsel-driven parallelism: A numa-aware query
evaluation framework for the many-core age. In SIGMOD
’14, 743-754.

[15] Y. Li, I. Pandis, R. Muller, V. Raman, and G. M. Lohman.
Numa-aware algorithms: the case of data shuffling. In CIDR,
2013.

[16] T. Neumann and G. Moerkotte. Characteristic sets: Accurate
cardinality estimation for rdf queries with multiple joins. In
ICDE ’11, 984-994.

[17] T. Neumann and G. Weikum. x-rdf-3x: fast querying, high
update rates, and consistency for rdf databases. VLDB ’10,
256-263.

[18] T. Neumann and G. Weikum. The rdf-3x engine for scalable
management of rdf data. The VLDB Journal, 91-113, 2010.

[19] L. P. Cordella et al. A (sub)graph isomorphism algorithm
for matching large graphs. IEEE Trans. Pattern Anal. Mach.
Intell., 1367-1372, 2004.

[20] H. Shang et al. Taming verification hardness: An efficient
algorithm for testing subgraph isomorphism. VLDB ’08,
364-375.

[21] B. Shao et al. Trinity: A distributed graph engine on a
memory cloud. In SIGMOD ’13, 505-516.

[22] F. M. Suchanek et al. Yago: A large ontology from wikipedia
and wordnet. Web Semant., 203-217, 2008.

[23] J. R. Ullmann. An algorithm for subgraph isomorphism. J.
ACM, 31-42, 1976.

[24] C. Weiss et al. Hexastore: sextuple indexing for semantic web
data management. VLDB ’08, 1008-1019.

[25] K. Wilkinson and K. Wilkinson. Jena property table
implementation. In SSWS ’06, 35-46.

[26] X. Yan et al. Graph indexing: A frequent structure-based
approach. In SIGMOD ’04, 335-346.

[27] X. Yan et al. Graph indexing based on discriminative
frequent structure analysis. ACM Trans. Database Syst.,
960-993, 2005.

[28] P. Yuan et al. Triplebit: a fast and compact system for large
scale rdf data. VLDB ’13, 517-528.

[29] K. Zeng et al. A distributed graph engine for web scale rdf
data. VLDB ’13, 265-276.

[30] S. Zhang et al. Treepi: A novel graph indexing method. In
ICDE ’07, 966-975.

[31] S. Zhang et al. Gaddi: Distance index based subgraph
matching in biological networks. In EDBT ’09, 192-203.

[32] P. Zhao and J. Han. On graph query optimization in large
networks. VLDB ’10, 340-351.

[33] L. Zou et al. A novel spectral coding in a large graph database.
In EDBT ’08, 181-192.

[34] L. Zou et al. gstore: answering sparql queries via subgraph
matching. VLDB ’11, 482-493.


